RoboRealm does the kind of vision software I understood and wanted to write myself to give a robot sight. There is no need to buy extra hardware; you can run it on a PC using a webcam.

"The RoboRealm application was created back in 2006 to take advantage of (1) lower cost generic computing (i.e. PCs), (2) a widening range of lower cost imaging devices, (3) an increasing need and usage of vision as primary sensor device and (4) the desire to quickly research custom solutions using an interactive user interface with minimal programming."

There is a 15-day trial if you want to play with it. I don't actually use it myself, so this is not meant as a recommendation. I'm just saying that CNNs are not required for some kinds of visual processing needs.
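
To make the "no CNN required" point concrete, here is a minimal sketch of one classic technique: threshold on colour, then take the centroid of the matching pixels to find where an object is in the frame. This is not RoboRealm's code or API. The function names, the thresholds, and the synthetic test image are all made up for illustration, and a real robot would read frames from a webcam instead.

```python
# Hypothetical sketch: find a red object by colour thresholding, no neural network.
# The image is a list of rows of (r, g, b) tuples; names and thresholds are assumptions.

def find_red_centroid(image, min_red=200, max_other=80):
    """Return the (x, y) centroid of pixels that look red, or None if none match."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # A pixel counts as "red" if red is strong and green/blue are weak.
            if r >= min_red and g <= max_other and b <= max_other:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def make_test_image(width=8, height=6):
    """Build a gray image with a 2x2 red square at columns 5-6, rows 2-3."""
    image = [[(120, 120, 120) for _ in range(width)] for _ in range(height)]
    for y in (2, 3):
        for x in (5, 6):
            image[y][x] = (255, 0, 0)
    return image


if __name__ == "__main__":
    print(find_red_centroid(make_test_image()))
```

A robot could compare the centroid's x value with the middle of the frame to decide whether to turn left or right toward the object. Simple thresholding like this is sensitive to lighting, so the thresholds would need tuning for a real camera.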