
First homework for CSE 5526: Introduction to Neural Networks at Ohio State, taught by Prof. DeLiang Wang. This one was all about the McCulloch-Pitts (M-P) neuron model and linear separability.

The M-P Neuron Model

The M-P neuron is about as simple as it gets. It takes binary inputs $x_i \in \{-1, 1\}$, computes a weighted sum, and applies a hard threshold:

$$y = \varphi\left(\sum_{i=1}^{m} w_i x_i + b\right)$$

where $\varphi(v) = 1$ if $v \geq 0$, and $\varphi(v) = -1$ if $v < 0$.
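In code, the whole model is just a dot product and a sign check. A minimal sketch (the function name `mp_neuron` is mine, not from the homework):

```python
def mp_neuron(inputs, weights, bias):
    """Return +1 if the weighted sum plus bias is >= 0, else -1."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if v >= 0 else -1

# Example: a 2-input AND gate over {-1, +1}, with hand-picked weights.
for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, "->", mp_neuron((x1, x2), (1, 1), -1))
```

With weights $(1, 1)$ and bias $-1$, only the input $(+1, +1)$ reaches the threshold, which is exactly AND.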

Simple as it is, you can implement any Boolean function by wiring enough of these together. The homework had three problems that built up to that idea.

Problem 1: Conditional Output

Design an M-P neuron with three inputs $x$, $y$, $z$ whose output is $z$ when $x = -1$ and $y = 1$, and $-1$ otherwise.

If you work through the truth table, the neuron only needs to activate when $x = -1$, $y = 1$, and $z = 1$. Every other combination should give $-1$. That gives us:

$$w_x = -1, \quad w_y = 1, \quad w_z = 1, \quad b = -3$$

Why $b = -3$? Because when $x = -1$, the weight $w_x = -1$ contributes $+1$, and $w_y$ and $w_z$ each contribute $+1$ in the target case. That sums to $3$, which exactly cancels the bias: $3 + (-3) = 0 \geq 0$. Any other combination of inputs can't get the sum above the threshold.
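That argument is easy to check exhaustively over all eight input combinations, reusing a little M-P helper (a sketch; the helper name is mine):

```python
from itertools import product

def mp_neuron(inputs, weights, bias):
    """+1 if the weighted sum plus bias is >= 0, else -1."""
    v = sum(w * xi for w, xi in zip(weights, inputs)) + bias
    return 1 if v >= 0 else -1

# Check the Problem 1 solution against the full 8-row truth table.
for x, y, z in product((-1, 1), repeat=3):
    expected = z if (x == -1 and y == 1) else -1
    got = mp_neuron((x, y, z), (-1, 1, 1), -3)
    assert got == expected, (x, y, z)
print("all 8 cases pass")
```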

[Figure: M-P neuron with weights and bias for the conditional output problem]

Problem 2: Binary Adder from M-P Neurons

This one was more involved. The problem asks you to design an M-P network that adds two 2-bit binary numbers: $uv + wx = yz$, where $u, v, w, x$ are binary inputs and $y, z$ are the two low-order output bits.

I ended up approaching it as a classic cascading adder:

  • A half adder for the least significant bits ($v + x$), producing sum bit $z$ and a carry
  • A full adder for the most significant bits ($u + w + \text{carry}$), producing sum bit $y$

[Figures: half adder and full adder circuits. The sum bits from these directly correspond to our z and y outputs.]

The interesting part is that the sum output of each adder stage is an XOR, which a single M-P neuron can’t compute (it’s not linearly separable). So you need at least two neurons for each XOR. The carry is just an AND gate, which one neuron handles fine.
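Here is one way to wire the whole adder out of M-P neurons in code. The particular weights for the XOR sub-network are my own choice (two hidden neurons feeding an OR), and the helper names are made up, but the structure matches the half-adder/full-adder cascade above:

```python
from itertools import product

def mp(inputs, weights, bias):
    """A single M-P neuron: hard-thresholded weighted sum."""
    v = sum(wi * xi for wi, xi in zip(weights, inputs)) + bias
    return 1 if v >= 0 else -1

def xor(a, b):
    # XOR is not linearly separable, so it needs a small network:
    n1 = mp((a, b), (1, -1), -1)    # fires when a=+1, b=-1
    n2 = mp((a, b), (-1, 1), -1)    # fires when a=-1, b=+1
    return mp((n1, n2), (1, 1), 1)  # OR of the two hidden neurons

def adder(u, v, w, x):
    z = mp_xor = xor(v, x)          # low sum bit (half adder)
    c = mp((v, x), (1, 1), -1)      # carry = AND(v, x)
    y = xor(xor(u, w), c)           # high sum bit (full adder, overflow dropped)
    return y, z

# Verify against ordinary integer addition, mapping -1 -> 0 and +1 -> 1.
bit = lambda s: (s + 1) // 2
for u, v, w, x in product((-1, 1), repeat=4):
    total = (2 * bit(u) + bit(v)) + (2 * bit(w) + bit(x))
    y, z = adder(u, v, w, x)
    assert (bit(y), bit(z)) == ((total >> 1) & 1, total & 1)
print("all 16 input combinations match")
```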

[Figure: M-P network that generates z (least significant bit)]

[Figure: M-P network that generates y (most significant bit)]

What I found satisfying about this problem is how it shows that neural networks and logic circuits are really doing the same thing at different levels of abstraction. You’re implementing AND, OR, and XOR gates with weighted sums and thresholds.

Problem 3: Linear Separability

Given three classes of 2D points:

$C_1: \{(4,1), (2,3), (3,5), (5,4), (1,6)\}$
$C_2: \{(0,2), (-2,2), (-3,2), (-2,4)\}$
$C_3: \{(1,-2), (3,-2)\}$

The question: can a single-layer perceptron with three outputs (one-vs-all) separate them?

Part (a): Yes. If you plot these points, you can draw two lines that separate all three classes. Each output neuron implements one linear boundary, so a single-layer perceptron with three output units works here.
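The one-vs-all claim is easy to check numerically. The three discriminants below are my own illustrative choices, not from the homework (any lines with the same separating behavior would do):

```python
C1 = [(4, 1), (2, 3), (3, 5), (5, 4), (1, 6)]
C2 = [(0, 2), (-2, 2), (-3, 2), (-2, 4)]
C3 = [(1, -2), (3, -2)]

# One candidate linear discriminant (w1, w2, b) per class.
discriminants = {
    "C1": (1, 1, -3.5),   # fires when x + y >= 3.5
    "C2": (-1, 0, 0.5),   # fires when x <= 0.5
    "C3": (0, -1, -0.5),  # fires when y <= -0.5
}

def fires(point, params):
    w1, w2, b = params
    return w1 * point[0] + w2 * point[1] + b >= 0

# Each discriminant should fire on exactly its own class (one-vs-all).
for name, pts in (("C1", C1), ("C2", C2), ("C3", C3)):
    for other, params in discriminants.items():
        for p in pts:
            assert fires(p, params) == (other == name), (name, other, p)
print("all points correctly classified one-vs-all")
```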

[Figure: the three classes plotted in 2D]

[Figure: two decision boundaries separate all three classes]

Part (b): Add $(-1, 6)$ to $C_1$, and now you're stuck. This point sits too close to $C_2$ for any set of linear boundaries to correctly classify everything with a one-vs-all strategy.
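One concrete way to see why it fails: the midpoint of the added point $(-1, 6)$ and the $C_3$ point $(1, -2)$ is exactly $(0, 2)$, which is itself a $C_2$ point. So the convex hull of $C_2$ touches the hull of the other classes, and no single line can put all of $C_2$ on one side and everything else on the other:

```python
# The segment from the new C1 point to a C3 point passes through a C2 point,
# so C2 cannot be linearly separated from C1 and C3 one-vs-all.
new_c1 = (-1, 6)
c3_pt = (1, -2)
mid = ((new_c1[0] + c3_pt[0]) / 2, (new_c1[1] + c3_pt[1]) / 2)
print(mid)  # (0.0, 2.0) -- a point of C2
```

Any half-plane containing both endpoints also contains their midpoint, which would force a $C_2$ point onto the wrong side.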

[Figure: adding (-1, 6) to C1 breaks linear separability]

What I Took Away

The main thing from this homework was getting comfortable with the M-P neuron model and seeing where it breaks down. A single neuron can only separate data with a hyperplane, so if your classes aren’t linearly separable, you’re out of luck with a single layer. The binary adder problem was a nice way to see that you can compose neurons into networks to get around that limitation for Boolean functions specifically, but the general case (like part (b) of problem 3) needs something more.

McCulloch-Pitts Neurons and Linear Separability
https://theanig.dev/blog/intro-to-nn-homework-1
Author Anirudh Ganesh
Published September 7, 2018