Wow, that’s a deep dive! The distinction you made between evolving the weights versus evolving the Hebbian rules themselves is a real 'lightbulb' moment... It makes sense that letting the neuron adapt its own connections based on environment-specific rules would be much faster than just relying on pre-trained static weights.
I'm curious about the 'Clear as mud' part, specifically the transition from synapse-level parameters to neuron-level rules. Does this change significantly reduce the computational load when you're running it on actual robot hardware? It seems like it would simplify the GA's search space quite a bit!
An electrical engineer who loves robots and drones
The idea of evolving the learning rules rather than static weights is brilliant; it’s like giving the robot a “manual on how to learn” instead of just a fixed memory. I really like the move from synapse-level parameters to neuron-level rules; it feels far more efficient in terms of computational load on real hardware. 🙂
It’s great to see this early-2000s research being revisited in the context of a physical robot. I’m really curious how the Fast Adaptive Behavior approach holds up against real-world unpredictability. Looking forward to seeing how this evolves.
An electrical engineer who loves robots and drones
Wow, that’s a deep dive! The distinction you made between evolving the weights versus evolving the Hebbian rules themselves is a real 'lightbulb' moment...
Not sure I ever wrote of evolving Hebbian rules? I think the paper @thrandell is hoping to duplicate in some way is about the synaptic weights of a neuron being modified, independently of the initially evolved weight values, by using Hebbian rules. Although I have found neural networks of great interest all my life, I have only played with them as simulations. My spare hobby time on building physical robots is more about trying to hard-code a practical working autonomous machine.
Genetic Evolution
I had to dust off Goldberg’s book to get back on track.
D. E. Goldberg. (1989). Genetic algorithms in search optimization and machine learning. Addison-Wesley, Reading, MA.
I still like this quote from the book:
“Genetic algorithms operate on populations of strings, with the string coded to represent some underlying parameter set. Reproduction, crossover and mutation are applied to successive string populations to create new string populations.”
I made use of Goldberg’s Simple Genetic Algorithm (SGA) again for my program that implements this ANN. The biggest deviation was that the chromosome’s parameters aren’t all the same size, so that presented a chance for me to hack his code a little. Those parameters are encoded on the chromosome as binary strings, one set for each neuron in the ANN.
For the one-bit sign I use a binary ‘1’ to represent positive and a ‘0’ to represent negative. The two-bit adaptation rule selects one of the four possible Hebb rules, and the last two bits map to an array of learning rates, [0.0, 0.3, 0.6, 0.9].
An interesting point about this neuron-based adaptive synapse encoding is that the Hebb rule and the learning rate only apply to the incoming synapses to the neuron, while the sign parameter is the sign of the axon (output) of the neuron.
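To make the encoding concrete, here is a small sketch of how one five-bit neuron gene could be unpacked. The bit layout (sign in the high bit, then the rule bits, then the rate bits), the struct and the function names are my own illustration, not the actual code from my program.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoder for a 5-bit per-neuron gene:
// bit 4 = sign (1 = positive, 0 = negative), bits 3-2 = Hebb rule index,
// bits 1-0 = index into the learning-rate table.
struct NeuronSpec {
    uint8_t sign;      // 1 = positive axon, 0 = negative
    uint8_t hebbRule;  // 0..3, one of the four Hebb rules
    float   rate;      // learning rate for the neuron's incoming synapses
};

static const float kRates[4] = {0.0f, 0.3f, 0.6f, 0.9f};

NeuronSpec decodeGene(uint8_t gene) {
    NeuronSpec s;
    s.sign     = (gene >> 4) & 0x1;  // one-bit sign
    s.hebbRule = (gene >> 2) & 0x3;  // two-bit adaptation rule
    s.rate     = kRates[gene & 0x3]; // two-bit learning-rate index
    return s;
}
```

The GA would then concatenate one such gene per neuron to form the chromosome.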
The fitness function used by the GA is the number of sensorimotor cycles that the robot stays in the gray area under the light while the light is on, divided by the total number of sensorimotor cycles available (500).
My take on how the fitness function is calculated is this.
The GA selects a pair of mates, performs crossover and mutation to evolve two new individuals, then calls a function to return the fitness for each individual. The function will perform the following steps 500 times:
- read the sensors;
- apply the Hebb rules;
- move the data through the ANN;
- run the motors for 100ms;
- increment the fitness count if the robot was over the gray area while the light was on.
After the 500 iterations the final fitness value is returned to the GA.
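The whole evaluation could be sketched like this. The helper names are my own placeholders, not the actual program's functions, and the stubs just fake the robot I/O so the loop structure is visible end to end.

```cpp
#include <cassert>
#include <cstdint>

static const uint16_t kLifeLength = 500;  // sensorimotor cycles per life

void readSensors()    {}  // stub: would sample the light/ToF sensors
void applyHebbRules() {}  // stub: would update the synapse weights in place
void runAnn()         {}  // stub: would move the data through the ANN
void runMotors100ms() {}  // stub: would drive the wheels for one 100 ms action

// fake scoring predicate: pretend the robot is under the lit light on
// every 5th cycle, just to exercise the loop
bool overGrayAreaWithLightOn() {
    static uint16_t call = 0;
    return (call++ % 5) == 0;
}

float evaluateFitness() {
    uint16_t hits = 0;
    for (uint16_t cycle = 0; cycle < kLifeLength; cycle++) {
        readSensors();
        applyHebbRules();
        runAnn();
        runMotors100ms();
        if (overGrayAreaWithLightOn())
            hits++;
    }
    return (float)hits / (float)kLifeLength;  // fraction of life spent scoring
}
```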
The experimenters used these SGA parameters on their physical robot.
| Parameter | Value |
| --- | --- |
| Population size | 80 |
| Number of generations | 40 |
| Elitism selection | best 16, clone 5 |
| Parameter encoding | binary values |
| One-point crossover probability | 0.2 |
| Biased mutation probability | 0.05 |
| Initial synaptic weight range | [0, 0.1] |
| Final synaptic weight range | [0, 1] |
| Life length | 500 actions |
| Action duration | 100 ms |
The paper mentions that most individuals cannot both turn the light on and move to the area under the light to gather fitness points: “In all experimental conditions (adaptive and genetically determined), most individuals in the population cannot manage to do both things within their life span.” To me this sounds like I could easily get to the point where the elitism selection process just won’t work. Floreano calls this the bootstrap problem and suggests one possible solution is to increase supervision. For example, the experimenter could include genetic selection criteria that reward completing subparts of the task, or start the evolutionary process with a simplified version of the task and then increase its complexity by modifying the selection criteria (incremental evolution), which is something I used before here:
An ideal solution “would be a self-organized process capable of producing incremental evolution that does not require any human supervision”. Hmm…
Nolfi, S. and Floreano, D. (2000). Evolutionary Robotics: Biology, Intelligence, and Technology of Self-Organizing Machines. MIT Press, Cambridge, MA.
Time will tell I guess.
Tom
To err is human.
To really foul up, use a computer.
So this might be a deep dive but this forum is supposed to be about Neural Networks after all. I thought it might be useful to explain the code that I came up with to run data through the artificial neural network.
Some background:
An artificial neural network is a collection of units (neurons) connected by weighted links (synapses, connection weights). The neural network model that McCulloch and Pitts came up with in 1943 says that the output of a unit Yi is a function f( ) of the sum of all incoming signals Xj weighted by connection strengths Wij: Yi = f( Σj Wij Xj ).
The incoming signals Xj are represented by X1 through X5 and the connection weights Wij are represented by Wi1 through Wi5. Sigma is the summation of the weighted signals and here the f ( ) represents the output function sigmoid. Sometimes there is a threshold that has to be reached before the neuron is triggered, but in this experiment there are none.
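In code, a single unit boils down to something like this. The names are illustrative, and the output function is the logistic sigmoid with no threshold, as in the experiment.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// f( ): the logistic sigmoid output function
float sigmoid(float x) {
    return 1.0f / (1.0f + expf(-x));
}

// Output Yi of one unit: f( ) of the sum of incoming signals Xj,
// each weighted by its connection strength Wij.
float unitOutput(const float x[], const float w[], uint8_t n) {
    float sum = 0.0f;                // Sigma: the weighted sum
    for (uint8_t j = 0; j < n; j++)
        sum += w[j] * x[j];          // Xj weighted by Wij
    return sigmoid(sum);             // squashed into (0, 1)
}
```

With all-zero weights the weighted sum is 0 and the sigmoid returns 0.5, which is a handy sanity check.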
Because this is a fully recurrent discrete-time neural network, every network node has a synapse connection to every other node, including itself.
There are a total of 12 nodes in the network. Ten receive input from the sensors and the remaining two have their output used to control the robot’s actuators (left and right wheels).
When I run the artificial neural network I’m taking the current node output (activation value) and applying the connection weight (determined by the Hebb rules) to calculate new node output values.
In this bit of code I’m using three arrays:
* nodeActivation[ ] is an array containing the latest node output.
* inputLayer[ ] contains the normalized sensor output. In the experiment the sensor values are added to the node output after the output function is applied, with the exception that the values added to the two actuator neurons are zero.
* synapseWeights[ ] [ ] is a two dimensional array containing connection weights that are updated by the Hebb rules.
void run_ann() {
  // Move the node activations and weights through the neural network.
  // run_ann() assumes that all neurons, synapses & input have been set with some initial values
  // and that Hebb rules and learning rates have all been applied to the synapseWeights array.
  // The motor neurons need to map their output to a range of speed values; that should happen elsewhere.
  // The nodespecs[x].sign should be applied as necessary to the post-synaptic node output here.
  float nodeJactivation[NUMNODES]; // temp table holds the starting node output
  float activation = 0.0f;
  float tempActivation;

  for (uint8_t t = 0; t < NUMNODES; t++)     // collect the soon to be t-1 node activations
    nodeJactivation[t] = nodeActivation[t];

  for (uint8_t i = 0; i < NUMNODES; i++) {   // post-synaptic
    activation = 0.0f;
    for (uint8_t j = 0; j < NUMNODES; j++) { // pre-synaptic
      if (i == j)
        activation += contextActivation[i] * synapseWeights[i][j]; // self-connection uses the context value
      else
        activation += nodeJactivation[j] * synapseWeights[i][j];
    } // end of pre-synaptic neuron loop

    // Adding the inputLayer is pushing a fair amount of these activations over 1.0, in test
    tempActivation = sigmoid(activation) + inputLayer[i]; // last two entries in inputLayer array will be 0 (motor neurons)
    if (tempActivation <= 0.0f) {
      if (nodespecs[i + 1].sign == 1)  // positive
        tempActivation = tempActivation * -1.0f;
    } else {
      if (nodespecs[i + 1].sign == 0)  // negative
        tempActivation = tempActivation * -1.0f;
    }
    nodeActivation[i] = tempActivation;
  } // end of post-synaptic neuron loop

  for (uint8_t c = 0; c < NUMNODES; c++)
    contextActivation[c] = nodeJactivation[c]; // move the starting output to the context (recurrent)
} // end run_ann
There is this notion in neural networks of signal flow from pre-synaptic neurons to post-synaptic neurons. I’m using the same idea in this code snippet. So in the code I loop through every post-synaptic neuron and sum the product of every pre-synaptic neuron’s current output value with the connection weight from each pre-synaptic neuron. The experimenters did change it up by adding the sensor input to the output value of the neuron. Finally the genetically determined sign is applied to the output value. The values from the two actuator neurons will be picked up by another routine that transforms them into wheel speed/direction commands.
Tom
To err is human.
To really foul up, use a computer.
Always an interesting subject, although I don't have the time to pursue it myself. In the early days I played around with neural net designs using image inputs. However, any input is just a list of sensor values, be they touch, light, sound or whatever, that controls the outputs.
Adaptation is being able to change a behaviour under the control of some goal. If the I/O is not leading to the goal state, the system can modify input/output connections until it does move toward the goal state.
Out of curiosity I asked AI if Dario Floreano’s experiments with evolved recurrent Hebbian networks had any practical outcomes.
AI replied:
Yes, Dario Floreano’s experiments with evolved recurrent Hebbian networks did have practical outcomes, though their impact is different from the direct hardware spin-offs (like the collision-proof drone) mentioned earlier.
These specific experiments, conducted largely in the late 1990s and early 2000s (often with Stefano Nolfi), were foundational to the field of Evolutionary Robotics. Their practical outcome was primarily algorithmic and methodological: they solved a major bottleneck in how autonomous robots could learn and adapt in changing environments without human intervention.
Here is a breakdown of the practical outcomes of those specific experiments:
Solving the "Reality Gap" in Adaptive Learning
Before these experiments, robots evolved in simulations often failed in the real world because simulations couldn't capture the messiness of reality (noise, friction, changing lighting).
The Problem: Standard neural networks had fixed weights once evolved. If a robot encountered a situation it hadn't seen during evolution (e.g., a light sensor breaking or the lighting changing from day to night), it failed.
The Solution: Floreano evolved the learning rules (Hebbian rules) rather than just the weights. This meant the robot wasn't born with a fixed brain; it was born with the genetic ability to rewire its own brain on the fly.
Practical Outcome: This proved that robots could be deployed in environments that were different from their training grounds. This methodology is now a standard reference point for designing controllers for robots that must operate in unpredictable environments (like planetary rovers or deep-sea AUVs) where they cannot be reprogrammed remotely.
"Behavioral Plasticity" Without External Supervision
The most famous experiment involved a predator-prey scenario or a robot navigating a maze where the target zone changed location periodically.
The Outcome: The robots evolved using Hebbian plasticity learned to change their behavior during their lifetime. If a robot went to "Zone A" and found no "food" (energy), the Hebbian synapses would weaken that association, and the robot would spontaneously explore "Zone B."
Practical Application: This demonstrated unsupervised online learning. Practically, this is applied in modern autonomous systems that need to detect faults. If a motor degrades, a static controller fails. A controller based on these plastic principles can "learn" that the motor is weak and adjust its control signals automatically to compensate, extending the machine's operational life.
Influence on Neuromorphic Computing
Floreano’s work demonstrated that simple, local learning rules (Hebbian) combined with recurrent loops could generate complex, memory-dependent behaviors without a central processor dictating every move.
Practical Outcome: This concept heavily influenced the development of neuromorphic hardware (like Intel's Loihi or IBM's TrueNorth). These chips use spiking neural networks and local plasticity rules to process data with extreme energy efficiency. Floreano's experiments provided early biological justification for why local plasticity is superior to global error-correction (like backpropagation) for low-power, autonomous agents.
Application in Vision-Based Navigation
Floreano applied these evolved networks to vision-based navigation on the Khepera robot (a small lab robot).
The Outcome: The robots learned to navigate corridors and avoid obstacles using very low-resolution visual input, adapting to different textures on the walls.
Practical Application: This validated the feasibility of using low-computation neural networks for Micro Air Vehicles (MAVs). It showed that you don't need a heavy GPU to process vision; you can do it with a small, evolving recurrent network. This logic underpins the autopilot software in many modern ultra-lightweight drones.
Summary
While you won't find a product on a shelf called "The Floreano Hebbian Chip," the practical outcome was the validation of a design philosophy.
These experiments proved that adaptive hardware is possible. They moved the field away from "programming a robot to do X" toward "creating a robot that can learn to do X, even if the rules change." This is the foundational logic used today in adaptive control systems for industrial robotics and fault-tolerant autonomous systems.
Since I’m always looking for a new robot controller that I don’t have to hand tune, I thought it might be interesting to try to reproduce their experiment and create a robot controller that can adapt to its environment.
You might consider using ChatGPT as a coding buddy, as Bill @dronebot-workshop says he is doing with his two-arm robot?
I have been surprised at how good it is.
For example, I asked about coding a neural network on an Arduino to allow a robot to travel in a straight line and adapt to changes in the mechanical parts over time. It generated an example but, more importantly, explained how it works and its limits, as well as suggestions as to where you might want to go next with the project.
In another example I asked it about one of your previous projects: to evolve a neural network on an Arduino robot that would travel around without hitting anything.
I asked:
Can you write code for an Arduino robot using two motor neurons with inputs from ultrasonic sensors that will learn to drive around without hitting anything using some measure of success for each learning trial such as not hitting an obstacle and how far it travels before hitting some obstacle
ChatGPT replied:
Love this — now we’re building a tiny reinforcement-learning robot 🤖
It then generated code along with explanations and suggestions and finishes each session with where you might want to go next with the project.
I would just like to add the caveat that these AIs can get it wrong. However, if you provide the error output or ask questions, it will try to resolve its errors.
As an example of ChatGPT in action I asked it to explain your code (it did) and to write a better version.
Let’s write a full, working version of your neural network code in matrix form
....
I then asked for something that would run on an Arduino board.
We can make an in-place, memory-efficient version for small Arduinos (UNO, Nano, etc.), where we don’t allocate any extra matrices like W_off. Instead, we compute contributions on the fly while keeping the diagonal self-weights separate. This saves significant RAM.
Here’s a compact Arduino-friendly version:
🔹 In-Place, Memory-Efficient run_ann() for Arduino
#include <Arduino.h>
#include <math.h>

#define NUMNODES 10 // Adjust to your network size

// ---------- Sigmoid ----------
float sigmoid(float x) {
    return 1.0f / (1.0f + expf(-x));
}

// ---------- Absolute value ----------
float absf(float x) {
    return x >= 0.0f ? x : -x;
}

// ---------- run_ann in-place ----------
void run_ann_matrix_inplace(
    float nodeActivation[NUMNODES],
    float contextActivation[NUMNODES],
    const float synapseWeights[NUMNODES][NUMNODES],
    const float inputLayer[NUMNODES],
    const uint8_t nodeSigns[NUMNODES] // 1 = positive, 0 = negative
) {
    float prevActivation[NUMNODES];

    // 1️⃣ Save previous activations
    for (uint8_t i = 0; i < NUMNODES; i++)
        prevActivation[i] = nodeActivation[i];

    // 2️⃣ Compute new activations
    for (uint8_t i = 0; i < NUMNODES; i++) {
        float sum = 0.0f;

        // 2a. Off-diagonal contributions
        for (uint8_t j = 0; j < NUMNODES; j++) {
            if (i != j)
                sum += synapseWeights[i][j] * prevActivation[j];
        }

        // 2b. Diagonal contribution (self-connection via context)
        sum += synapseWeights[i][i] * contextActivation[i];

        // 3️⃣ Activation function + input
        float h = sigmoid(sum) + inputLayer[i];

        // 4️⃣ Apply sign constraint
        float sign = (nodeSigns[i] == 1) ? 1.0f : -1.0f;
        nodeActivation[i] = sign * absf(h);
    }

    // 5️⃣ Update context with previous activations
    for (uint8_t i = 0; i < NUMNODES; i++)
        contextActivation[i] = prevActivation[i];
}
🔹 Why This is Memory Efficient
- ✅ No extra NxN matrices allocated (W_off is gone)
- ✅ Only one temporary array, prevActivation, is needed
- ✅ Uses fixed-size arrays compatible with small Arduinos
- ✅ Preserves diagonal self-weights (recurrent memory)
- ✅ Preserves sign constraints (+/- neurons)
🔹 Example Usage on Arduino

float nodeActivation[NUMNODES] = {0};
float contextActivation[NUMNODES] = {0};
float synapseWeights[NUMNODES][NUMNODES] = {
    { 0.5, -0.2,  0.1,  0,   0, 0, 0, 0, 0, 0},
    {-0.1,  0.6,  0,    0.1, 0, 0, 0, 0, 0, 0},
    { 0.2,  0,    0.7, -0.3, 0, 0, 0, 0, 0, 0},
    // ... fill in all 10x10 weights
};
float inputLayer[NUMNODES] = {0};
uint8_t nodeSigns[NUMNODES] = {1, 1, 0, 1, 0, 1, 1, 0, 1, 1};

void setup() {
    Serial.begin(9600);
}

void loop() {
    run_ann_matrix_inplace(nodeActivation, contextActivation, synapseWeights, inputLayer, nodeSigns);

    // Print activations
    for (uint8_t i = 0; i < NUMNODES; i++) {
        Serial.print(nodeActivation[i], 3);
        Serial.print(" ");
    }
    Serial.println();

    delay(100); // simulate timestep
}
This version is perfect for small microcontrollers: it minimizes RAM usage while still being fully recurrent, sign-constrained, and matrix-style.
If you want, I can also optimize it further to eliminate even the prevActivation array by cleverly using the context array for both purposes, which would save even more RAM — ideal for something like an Arduino UNO with 2 KB RAM.
Do you want me to do that next?
Do you want me to do that next?
No thank you.
Your re-writing of my code using ChatGPT is not helpful. Please stop!
Instead why don’t you take one of your unfinished projects and use ChatGPT to finish it.
To err is human.
To really foul up, use a computer.
Do you want me to do that next?
No thank you.
Your re-writing of my code using ChatGPT is not helpful. Please stop!
Instead why don’t you take one of your unfinished projects and use ChatGPT to finish it.
My sincere apologies.
The sentence "Do you want me to do that next?" was actually from ChatGPT; it's the way it ends a lot of posts and not something I would have written. I wasn't trying to take over your project.
You reacted to my post with ChatGPT's analysis of the work of Dario Floreano’s experiments but I guess it was a thumbs down.
Instead why don’t you take one of your unfinished projects and use ChatGPT to finish it.
Well, I finished all my one-off projects except the ongoing autonomous robot project, which will never end as I keep trying to improve it. No one is interested in it anyway, so I don't post about it.
I promise not to respond to your posts in the future.
Best Wishes
John
The Physical Robot
I decided not to try to re-create the Khepera robot that Floreano & Urzelai used. Instead I re-purposed a robot I used on another project. I managed to re-use the main board and the motors, wheels and motor driver. I did replace the Pico with a Pico_w.
The robot needs to be able to detect ambient light and wall color. So I selected the STMicroelectronics Time of Flight VL53L1X sensor because it can detect distance and ambient light. It also seems pretty versatile (low power, programmable Region Of Interest, supports I2C address change, maximum 27-degree field of view). To detect wall color I selected the HM01B0-based CMOS image sensor from Arducam. It is also low power, and I found a Pico library written by Sandeep Mistry that offers all the functionality I need.
I used a feature of the VL53L1X to divide the back facing sensor into two 8 by 16 SPAD (single photon avalanche diode) regions. I’ve combined the left side sensor with the left half of the back sensor to measure left side ambient light, and likewise on the right side.
I also have divided the frame collected by the HM01B0 image sensor into three regions. I average the greyscale values for each pixel in the three regions and use them as input to the neural network.
The image sensor has a maximum resolution specification of 320 by 320 pixels, but it also supports Quarter VGA (240 by 320) and Quarter QVGA (120 by 160) resolutions. I’m trying out the Quarter QVGA in an arena that will probably only be 2 feet by 4 feet in size.
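The region averaging could look roughly like this, assuming a 160 by 120 row-major greyscale frame split into vertical thirds. The exact boundaries and names are my own illustration, not the robot's actual code.

```cpp
#include <cassert>
#include <cstdint>

#define FRAME_W 160  // QQVGA width in pixels (assumed layout)
#define FRAME_H 120  // QQVGA height in pixels

// Average the pixels in the left, centre and right thirds of the
// frame; the three means become inputs to the neural network.
void regionMeans(const uint8_t frame[], float means[3]) {
    uint32_t sums[3] = {0, 0, 0};
    const uint16_t third = FRAME_W / 3; // 53 px; the right region takes the remainder

    for (uint16_t y = 0; y < FRAME_H; y++) {
        for (uint16_t x = 0; x < FRAME_W; x++) {
            uint8_t r = (x < third) ? 0 : (x < 2 * third) ? 1 : 2;
            sums[r] += frame[y * FRAME_W + x];
        }
    }

    const uint32_t counts[3] = {
        (uint32_t)third * FRAME_H,
        (uint32_t)third * FRAME_H,
        (uint32_t)(FRAME_W - 2 * third) * FRAME_H
    };
    for (uint8_t r = 0; r < 3; r++)
        means[r] = (float)sums[r] / (float)counts[r];
}
```

The 32-bit sums matter here: a full third of the frame at pixel value 255 would overflow a 16-bit accumulator.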
There is a reflectance sensor that isn’t used as input to the artificial neural network but as input to a TCP client running on the robot that communicates with a server to turn the light on/off and also to log the fitness of each individual at the end of its life.
Laying out the GPIO wiring took a little tweaking because I wanted to re-use the motor driver wiring and also make it easy to dismantle the sensors from the rest of the robot. The motors and the image sensor must use different PWM channels on the Pico_w so that was a concern too.
I thought it was interesting that the original Khepera robot could only measure the distance to objects up to 4 cm away, while the VL53L1X ToF sensor can only reliably start measuring the distance to objects beyond 4 cm. Ha.
I used 28 AWG wire for the serial I2C data and clock bus lines. These worked out pretty well with the crimp connectors I had because I could just fit two wires in each connector housing. Here’s a photo of the terminating connection at the image sensor.
The external power (5+V) attaches on the right side and passes through the left side to power the Pico_w and the N20 gear motors. On the Adafruit Perma-Proto board I’ve mounted a voltage regulator to drop the voltage down to 3.3V for the HM01B0 image sensor. The other sensors are powered from the Pico_w 3V3 (OUT) GPIO.
Proof of Concept
Before I finalized the sensors I had a number of proof of concepts to get through. Namely:
- can the reflectance sensor detect the difference between white, grey and black flooring?
- is there a way to re-assign the I2C bus address on the VL53L1X sensors?
- can the image sensor and the Time of Flight libraries play nice on the same Pico_w?
- can the ambient light sensors detect the light from a white LED out to 1 meter?
- can the image sensor distinguish a black stripe on a white wall in low ambient light?
- can the Pico_w support all this and a TCP client? I’m still working on this one.
I’m new to the Networking Libraries and the Lightweight IP stack that are part of the Pico SDK, so I’m making my way through the tcp_client.c and tcp_server.c example programs to see what I can leverage to do this last PoC. I might have the tcp_server write to an SD card…
Tom
To err is human.
To really foul up, use a computer.