All the software written for this project is in Python. I’m not an expert Python programmer, far from it, but the huge number of available libraries and the fact that I can make some sense of it all without having spent a lifetime in Python made this a fairly obvious choice. There is a Python distribution called Anaconda which takes the sting out of maintaining a working Python setup. Python really sucks at this: it is quite hard to resolve all the interdependencies and version issues, and using ‘pip’ and the various ways in which you can set up a virtual environment is a complete nightmare once things get over a certain complexity level. Anaconda makes that all manageable and it gets top marks from me for that.
The Lego sorter software consists of several main components, there is the frame grabber which takes images from the camera:
Scanner / Stitcher
Then, after the grabber has done its work it sends the image to the stitcher which does two things: the first thing it does is determine how much the belt with the parts on it has moved since the previous frame (that’s the function of that wavy line in the videos in part 1, that wavy line helps to keep track of the belt position even when there are no parts on the belt), and then it will update an in-memory image with the newly scanned bit of what’s under the camera. Whenever there is a vertical break between parts the stitched part gets cut and the newly scanned part gets sent on.
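To make the belt tracking step a bit more concrete, here is a minimal sketch of one way to estimate how far the belt moved between two frames. It is not the sorter's actual code: it assumes each frame has been reduced to a one-dimensional brightness profile along the direction of travel (for instance by sampling the wavy line), and the names `estimate_shift`, `stitch` and `max_shift` are made up for this illustration.

```python
def estimate_shift(prev_profile, curr_profile, max_shift):
    """Return how many pixels the belt moved between two frames.

    Assumption: both profiles have the same length, and content seen at
    index i + shift in the previous frame shows up at index i in the
    current frame.
    """
    n = len(prev_profile)
    best_shift, best_cost = 0, float("inf")
    for shift in range(0, min(max_shift, n - 1) + 1):
        overlap = n - shift
        # Mean absolute difference over the region both frames share.
        cost = sum(
            abs(prev_profile[i + shift] - curr_profile[i])
            for i in range(overlap)
        ) / overlap
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def stitch(stitched, curr_profile, shift):
    """Append only the newly exposed samples of the current frame."""
    return stitched + curr_profile[len(curr_profile) - shift:]
```

The same idea extends to full images by comparing strips of pixels instead of single values. A non-repeating pattern on the belt matters here: with a blank belt every shift would score equally well.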
After the scanner/stitcher has done its job a part image looks like this:
Stitching takes care of the situation where a part is longer than what fits under the camera in one go.
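Cutting the stitched image at the breaks between parts can be sketched as follows. This is an illustration under assumptions, not the real implementation: it assumes the stitched image has already been reduced to a count of foreground (non-belt) pixels for each line across the belt, and `split_parts` is a hypothetical helper name.

```python
def split_parts(line_counts):
    """Split a stitched scan into parts at empty lines.

    line_counts[i] is the number of foreground pixels in line i of the
    stitched image. Returns (finished, open_start): a list of
    (start, end) spans for parts that are completely scanned, and the
    start index of a part that is still under the camera (or None).
    """
    finished = []
    start = None
    for i, count in enumerate(line_counts):
        if count > 0 and start is None:
            start = i  # first line of a new part
        elif count == 0 and start is not None:
            finished.append((start, i))  # gap found: the part is complete
            start = None
    return finished, start
```

Returning the unfinished part separately is what lets a long part keep growing across frames until the gap behind it finally shows up.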
This is where things get interesting. So, I’ve built this part several times now, to considerable annoyance.
The first time I was just using OpenCV primitives, especially contour matching and circle detection. Between those two it was possible to do a reasonably accurate recognition of parts as long as there were not too many different kinds of parts. This, together with some simple meta data (l, w, h of the part) can tell the difference between all the basic lego bricks, but not much more than that.
So, back to the drawing board: enter Bayes. Bayes classifiers are fairly well understood: you basically engineer a bunch of features, build detectors for those, create a test-set to verify that your detector works as advertised and you try to crank up the discriminating power of those features as much as you can. This you then run over as large a set of test images as you can to determine the ‘priors’ that will form the basis for the relative weighting of each feature as it is detected to be ‘true’ (feature is present) or ‘false’ (feature is not present). I used this to make a classifier based on the following features:
cross (two lines meeting somewhere in the middle)
circle (the part contains a circle larger than a stud)
edge_studs (studs visible edge-on)
full (the part occupies a large fraction of its outer perimeter)
holes (there are holes in the part)
holethrough (there are holes all the way through the part)
plate (the part is roughly a plate high)
rect (the part is rectangular)
slope (the part has a sloped portion)
skinny (the part occupies a small fraction of its outer perimeter)
square (the part is roughly square)
studs (the part has studs visible)
trans (the part is transparent)
volume (the volume of the part in cubic mm)
wedge (the part has a wedge shape)
And possibly others… This took quite a while. It may seem trivial to build a ‘studs detector’ but that’s not so simple. You have to keep in mind that the studs could be in any orientation, that there are many bits that look like studs but really aren’t and that the part could be upside-down or facing away from the camera. Similar problems with just about every feature so you end up tweaking a lot to get to acceptable performance for individual features. But once you have all that working you get a reasonable classifier for a much larger spectrum of parts.
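For readers who have not seen one, a classifier of this kind can be sketched in a few lines. This is a generic naive Bayes over true/false features with add-one smoothing, not the code used in the sorter; it ignores numeric features such as volume, and the part names in the usage note are invented for the example.

```python
import math
from collections import defaultdict


def train(samples):
    """samples: list of (label, set of feature names detected as present)."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for label, present in samples:
        class_counts[label] += 1
        for name in present:
            feature_counts[label][name] += 1
    return class_counts, feature_counts


def classify(model, feature_names, present):
    """Return the label with the highest posterior log-probability."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for name in feature_names:
            # Add-one smoothing keeps probabilities away from 0 and 1.
            p = (feature_counts[label][name] + 1) / (n + 2)
            score += math.log(p if name in present else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Trained on a handful of samples such as `("plate_2x4", {"studs", "rect", "plate"})`, a call like `classify(model, ["studs", "rect", "plate", "slope"], {"studs", "rect", "plate"})` picks the plate. Note that every class is scored for every part, which is where the speed problem described next comes from.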
Even so, this is far from perfect: it is slow, with every category you add you’re going to be doing more work in order to figure out which category a part is. The ‘best match’ can come from a library of parts which itself is growing so there is a nice geometrical element to the amount of computer time spent. Accuracy was quite impressive but in the end I abandoned this approach because of the speed (it could not keep up with the machine) and changed to the next promising candidate, an elimination based system.
The elimination system used the same criteria as the ones listed before. Sorting the properties in decreasing order of effectiveness allowed a very rapid elimination of non-candidates, and so the remainder could be processed quite efficiently. This was the first time the software was able to keep up with the machine running full-speed.
There are a couple of problems with this approach: once something is eliminated, it won’t be back, even if it was the right part after all. The fact that it is a rather ‘binary’ approach really limits the accuracy, so you’d need a huge set of data to make this work, and that would probably reduce the overall effectiveness quite a bit.
It also ends up quite frequently eliminating all the candidates, which doesn’t help at all. So, accuracy wasn’t fantastic and fixing the accuracy would likely undo most of the speed gains.
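The elimination idea, and the way it fails, fits in a short sketch. This is an illustration under assumptions rather than the original code: every candidate part is described by a dictionary of true/false features, `feature_order` is assumed to be sorted from most to least discriminating, and the part names in the usage note are made up.

```python
def eliminate(candidates, observed, feature_order):
    """Drop every candidate that disagrees with an observed feature.

    candidates: dict mapping part name -> dict of feature name -> bool
    observed: dict of feature name -> bool for the scanned part
    Returns the sorted names of the surviving candidates.
    """
    remaining = dict(candidates)
    for name in feature_order:
        remaining = {
            part: feats
            for part, feats in remaining.items()
            if feats[name] == observed[name]
        }
        if not remaining:
            break  # everything eliminated, nothing left to test
    return sorted(remaining)
```

With candidates for a brick, a plate and a slope, an observation of `{"studs": True, "plate": True, "slope": False}` leaves only the plate. Flip a single detector result (say `slope` wrongly reported as `True`) and the function returns an empty list, which is exactly the failure mode described above.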
Tree based classification
This was an interesting idea. I made a little tree along the lines of the Animal Guessing Game. Every time you add a new item to the tree it will figure out which of the features are different and it will then split the node at which the last common ancestor was found to accommodate the new part. This had some significant advantages over the elimination method: the first is that you can have a part in multiple spots in the tree which really helps accuracy. The second is that it is lightning fast compared to all the previous methods.
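A minimal sketch of such a tree, in the spirit of the Animal Guessing Game, could look like this. It is a simplified illustration and not the original implementation: features are assumed to be true/false values, and `Leaf`, `Node`, `insert` and `lookup` are names chosen for this example.

```python
class Leaf:
    def __init__(self, part, features):
        self.part = part
        self.features = features


class Node:
    def __init__(self, feature, yes, no):
        self.feature = feature  # question asked at this node
        self.yes = yes          # subtree when the feature is present
        self.no = no            # subtree when the feature is absent


def insert(tree, part, features, feature_names):
    """Insert a part and return the (possibly new) root of the tree."""
    if tree is None:
        return Leaf(part, features)
    if isinstance(tree, Node):
        if features[tree.feature]:
            tree.yes = insert(tree.yes, part, features, feature_names)
        else:
            tree.no = insert(tree.no, part, features, feature_names)
        return tree
    # Reached a leaf: split it on the first feature that differs.
    for name in feature_names:
        if features[name] != tree.features[name]:
            new_leaf = Leaf(part, features)
            if features[name]:
                return Node(name, new_leaf, tree)
            return Node(name, tree, new_leaf)
    return tree  # no known feature tells the two apart; keep the old leaf


def lookup(tree, features):
    """Follow the questions down to a leaf and return its part name."""
    while isinstance(tree, Node):
        tree = tree.yes if features[tree.feature] else tree.no
    return tree.part if tree is not None else None
```

Inserting the same part twice with different feature sets (for example once studs-up and once upside-down) simply creates two leaves with the same name, which is how a part can live in multiple spots in the tree. A lookup only asks as many questions as the tree is deep, which explains the speed.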
But it still has a significant drawback: you need to manually create all the features first and that gets really tedious, assuming you can even find ‘clear’ enough features that you can write a straight up feature detector using nothing but OpenCV primitives. And that can get challenging fast, especially because python is a rather slow language and if your problem can’t be expressed in numpy or OpenCV library calls you’ll be looking at a huge speed penalty.
Finally! So, after roughly 6 months of coding up features, writing tests and scanning parts I’d had enough. I realized that there is absolutely no way that I’ll be able to write a working classifier for the complete spectrum of parts that Lego offers and that was a real let-down.
So, I decided to bite the bullet and get into machine learning in a more serious manner. For weeks I read papers, studied all kinds of interesting bits and pieces regarding Neural Networks.
I had already played with them when they first became popular in the 1980’s after reading a very interesting book on a related subject. I used some of the ideas in the book to rescue a project that was due in a couple of days where someone had managed to drop a coin into the only prototype of a Novix based Forth computer that was supposed to be used for a demonstration of automatic license plate recognition. So, I hacked together a bit of C code with some DSP32 code to go with it and made the demo work, and promptly forgot about the whole thing.
A lot has happened in the land of Neural Networks since then, the most amazing thing is that the field almost died and now it is going through this incredible renaissance powering all kinds of real world solutions. We owe all that to a guy called Geoffrey Hinton who just simply did not give up and turned the world of image classification upside down by winning a competition in a most unusual manner.
After that it seemed as if a dam had been broken, and one academic record after another was beaten with huge strides forward in accuracy for tasks that historically had been very hard for computers (vision, speech recognition, natural language processing).
So, lots of studying later I had settled on using TensorFlow, a huge library of very high quality produced by the Google Brain Team, where some of the smartest people in this field in the world are collaborating. Google has made the library open-source and it is now the foundation of lots of machine learning projects. There is a steep learning curve though, and for quite a while I found myself stuck on figuring out how to proceed best.
Within hours (yes, you read that well) I had surpassed all of the results that I had managed to painfully scrounge together feature-by-feature over the preceding months, and within several days I had the sorter working in real time for the first time with more than a few classes of parts. To really appreciate this a bit more: approximately 2000 lines of feature detection code, another 2000 or so of tests and glue was replaced by less than 200 lines of (quite readable) Keras code, both training and inference.
The speed difference and ease of coding was absolutely incredible compared to the hand-coded features. While not quite as fast as the tree mechanism accuracy was much higher and the ability to generalize the approach to many more classes without writing code for more features made for a much more predictable path.
The hard challenge to deal with next was to get a training set large enough to make working with 1000+ classes possible. At first this seemed like an insurmountable problem. I could not figure out how to make enough images and to label them by hand in acceptable time, even the most optimistic calculations had me working for 6 months or longer full-time in order to make a data set that would allow the machine to work with many classes of parts rather than just a couple.
In the end the solution was staring me in the face for at least a week before I finally clued in: it doesn’t matter. All that matters is that the machine labels its own images most of the time and then all I need to do is correct its mistakes. As it gets better there will be fewer mistakes. This very rapidly expanded the number of training images. The first day I managed to hand-label about 500 parts. The next day the machine added 2000 more, with about half of those labeled wrong. The resulting 2500 parts were the basis for the next round of training 3 days later, which resulted in 4000 more parts, 90% of which were labeled right! So I only had to correct some 400 parts, rinse, repeat… So, by the end of two weeks there was a dataset of 20K images, all labeled correctly.
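The arithmetic behind this is worth spelling out. The small sketch below only replays the numbers quoted above (500 hand-labeled parts, then 2000 with half wrong, then 4000 with 10% wrong); the function name and the representation of a round as a pair are choices made for this example.

```python
def manual_actions(rounds):
    """Count images gained versus labels that had to be set by hand.

    rounds: list of (new_images, percent_needing_manual_label). The
    first, fully hand-labeled round counts as 100 percent manual.
    """
    total_images = 0
    total_actions = 0
    for new_images, percent_manual in rounds:
        total_images += new_images
        total_actions += new_images * percent_manual // 100
    return total_images, total_actions


rounds = [(500, 100), (2000, 50), (4000, 10)]
print(manual_actions(rounds))  # (6500, 1900)
```

So after three rounds there are 6500 labeled images for 1900 manual actions, and the share of manual work keeps dropping as the classifier improves.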
This is far from enough, some classes are severely under-represented so I need to increase the number of images for those, perhaps I’ll just run a single batch consisting of nothing but those parts through the machine. No need for corrections, they’ll all be labeled identically.
I’ve had lots of help in the last week since I wrote the original post, but I’d like to call out two people by name because they’ve been instrumental in improving the software and increasing my knowledge. The first is Jeremy Howard, who has gone above and beyond the call of duty to fill in the gaps in my knowledge; without his course I would have never gotten off the ground in the first place. The second is Francois Chollet, the maker of Keras, who has been extremely helpful in providing a custom version of his Xception model to help speed up training.
Right now training speed is the bottle-neck, and even though my Nvidia GPU is fast it is not nearly as fast as I would like it to be. It takes a few days to generate a new net from scratch but I simply don’t think it is responsible to splurge on a 4 GPU machine in order to make this project go faster. Patience is not exactly my virtue but it looks as though I’ll have to get more of it. At some point all the software and data will be made open source, but I still have a long way to go before it is ready for that.
Once the software is able to reliably classify the bulk of the parts I’ll be pushing through the huge mountain of bricks, and after that I’ll start selling off the result, both sorted parts as well as old sets.
Finally, to close off this post, an image of the very first proof-of-concept, entirely made out of Lego:
One of my uncles cursed me with the LEGO bug, when I was 6 he gave me his collection because he was going to university. My uncle and I are relatively close in age, my dad was the eldest of 8 children and he is the youngest. So, for many years I did nothing but play with lego, build all kinds of machinery and in general had a great time until I discovered electronics and computers.
So, my bricks went to my brother, who in turn gave them back to my children when they were old enough and so on. By the time we reached 2015 this had become a nice collection, but nothing you’d need machinery for to sort it.
That changed. After a trip to Legoland in Denmark I noticed how even adults buy lego in vast quantities, and at prices that were considerably higher than what you might expect for what is essentially bulk ABS. Even second hand lego isn’t cheap at all, it is sold by the part on specialized websites, and by the set, the kilo or the tub on ebay.
After doing some minimal research I noticed that sets do roughly 40 euros / Kg and that bulk lego is about 10, rare parts and lego technic go for 100’s of euros per kg. So, there exists a cottage industry of people that buy lego in bulk, buy new sets and then part this all out or sort it (manually) into more desirable and thus more valuable groupings.
I figured this would be a fun thing to get in on and to build an automated sorter. Not thinking too hard I put in some bids on large lots of lego on the local ebay subsidiary and went to bed. The next morning I woke up to a rather large number of emails congratulating me on having won almost every bid (lesson 1: if you win almost all bids you are bidding too high). This was both good and bad. It was bad because it was probably too expensive and it was also bad because it was rather more than I expected. It was good because this provided enough motivation to overcome my natural inertia to actually go and build something.
And so, the adventure started. In the middle of picking up the lots of lego my van got stolen so we had to make do with an elderly Espace, and one lot was so large it took 3 trips to pick it all up. By the time it was done a regular garage was stacked top-to-bottom with crates and boxes of lego. Sorting this manually was never going to work: some trial bits were sorted and by my reckoning it would take several lifetimes to get that all organized.
Computer skills to the rescue! A first proof of concept was built of - what else - lego. This was hacked together with some python code and a bunch of hardware to handle the parts. After playing around with that for a while it appeared there were several basic problems that needed to be solved, some obvious, some not so obvious. A small collection:
fake parts needed to be filtered out
There is a lot of fake lego out there. The problem is that fake lego is worth next to nothing and if a fake part is found in a lot it devalues that lot tremendously because you now have to check each and every part to make sure you don’t accidentally pass on fake lego to a customer.
Lego is often assembled as a set and then put on display. That’s nice, but if the display location is in the sun then the parts will slowly discolor over time. White becomes yellow, blue becomes greenish, red and yellow fade and so on. This would be fairly easy to detect if it wasn’t for the fact that lego has a lot of colors and some of the actual colors are quite close to the faded ones.
Not all Lego is equally strong, and some parts are so prone to breakage it is actually quite rare to find them in one piece. If you don’t want to pass on damaged parts to customers you need to have some way of identifying them and picking them out of the parts stream.
Most Lego that was bought was clean, but there were some lots that looked as if someone had been using them as growth substrate for interesting biological experiments. Or to house birds…
feeding lego reliably from a hopper is surprisingly hard
Lego is normally assembled by children’s hands, but a bit of gravity and some moving machine parts will sometimes do an excellent job of partially assembling a car or some other object. This tendency is especially pronounced when it comes to building bridges, and I’ve yet to find a hopper configuration wide and deep enough that a random assortment of Lego could not form a pretty sturdy bridge across the span.
The current incarnation uses a slow belt to move parts from the hopper onto a much faster belt that moves parts past the camera.
Scanning parts seems to be a trivial optical exercise, but there are all kinds of gotchas here. For instance, parts may be (much!) longer than what fits under the camera in one go, parts can have a color that is extremely close to the color of the background and you really need multiple views of the same part. This kept me busy for many weeks until I had a setup that actually worked.
Once you can reliably feed your parts past the camera you have to make sense of what you’re looking at. There are 38000+ shapes and there are 100+ possible shades of color (you can roughly tell how old someone is by asking them what lego colors they remember from their youth). After messing around with carefully crafted feature detection, decision trees, bayesian classification and other tricks I’ve finally settled on training a neural net and using that to do the classification. It isn’t perfect but it is a lot easier than coding up features by hand, many lines of code, test cases and assorted maintenance headaches were replaced by a single classifier based on the VGG16 model but with some Lego specific tweaks and then trained on large numbers of images to get the error rate to something acceptable. The final result classifies a part in approximately 30 ms on a GTX1080ti Nvidia GPU. One epoch of training takes longer than I’m happy with but that only has to be done once.
distributing parts to the right bin
This also was an interesting problem, after some experimenting with servos and all kinds of mechanical pushers the final solution here was to simply put a little nozzle next to the transport belt and to measure very precisely how long it takes to move a part from the scan position to the location of the nozzles. A well placed bin then catches the part.
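The timing part of this can be sketched as a small scheduler. This is an illustration under assumptions, not the machine's control code: it assumes a measured travel time from the scan position to each nozzle is available, and `PuffScheduler`, its methods and the numbers in the usage note are all hypothetical.

```python
import heapq


class PuffScheduler:
    """Queue nozzle activations based on measured belt travel times."""

    def __init__(self, travel_time_s):
        # travel_time_s: nozzle id -> seconds from scan position to nozzle
        self.travel_time_s = travel_time_s
        self._queue = []  # min-heap of (fire_time, nozzle)

    def schedule(self, scan_time_s, nozzle):
        """Register a part scanned at scan_time_s that belongs to nozzle."""
        fire_time = scan_time_s + self.travel_time_s[nozzle]
        heapq.heappush(self._queue, (fire_time, nozzle))

    def due(self, now_s):
        """Return the nozzles that should fire at or before now_s."""
        fired = []
        while self._queue and self._queue[0][0] <= now_s:
            fired.append(heapq.heappop(self._queue)[1])
        return fired
```

A control loop would call `schedule()` whenever the classifier assigns a part to a bin and poll `due()` with the current time. With travel times of, say, 0.25 s and 0.5 s for two nozzles, a part scanned at t = 10.0 s for the second nozzle becomes due at t = 10.5 s.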
Building all this has been a ton of fun. As I wrote above the prototype was made from Lego, the current one is a hodge-podge of re-purposed industrial gear, copious quantities of crazy glue and a heavily modified home running trainer that provides the frame to attach all the bits and pieces to.
Note that this is by no means finished but it’s the first time that all the parts have come together and that it actually works well enough that you can push kilos of Lego through it without interruption. The hopper mechanism can still be improved a lot, there is an easy option to expand the size of the bins and there are still obvious improvements on the feeder. The whole thing runs very quiet, a large factor in that is that even though the system uses compressed air the compressor is not your regular hardware store machine but one that uses two freezer motors to very quietly fill up the reserve tank.
Here is a slow run tracking some parts so you can see how all the bits work together (it can run much faster):
A faster run, still slow enough that you can hopefully see what is going on:
I grew up in Amsterdam, which is a pretty rough town by Dutch standards. As a kid there are all kinds of temptations, and peer-pressure to join in on bad stuff is something that is hard to escape. But somehow that never was a big factor for me, computers and electronics kept me fascinated for long enough that none of that ever mattered. But sooner or later you realize that being good with computers is also something that can be used for bad.
For me that moment came when one of my family members showed up at my combined house-office in the summer of 1997. The car he drove was a late model E-Class Mercedes. This particular family member has a pretty checkered history. When I still lived with my mom as a kid he would show up once or twice every year, unannounced and would comment on our poor condition and would give me a large bill to go to the night store and get luxury food. Salmon, French cheese, party time. Always flashing his success and mostly pretending to be wealthy. He vowed he’d pay for my driving license which is a big deal here in NL, that costs lots of money, but then never did. This was fine by me, I could easily pay for it myself but it didn’t exactly set the stage for a relationship of trust. Also, in the years prior to this I had never seen or heard from him.
What had changed was this: a few weeks prior to the visit there had been a large newspaper article about me and one of the things that it mentioned was my skills with computers. And this must have been the reason that my family member decided that those skills were undervalued by the marketplace and I needed a bit more in terms of opportunities.
So here was his plan: he’d bring me one of those cars every week. I could drive it as long as I made sure that when it went back to him it would have 200,000 kilometers less on the counter than what it had when he brought it. Every car would come with 5000 guilders in the glove compartment, mine to keep. Now, I’m sure that this is a hard thing to relate to, but when your family, even if you hardly ever see them shows up and makes you a proposition you can’t just tell them to fuck off. Especially not when they’re dangerous people. So I had a real problem, there was no way I was going to do this but saying no wasn’t simple either.
The backstory to this is that those cars were taxis which had been used intensively in the two years that they were old and that their market value as low mileage cars was much higher than their market value with 200K+ on them.
In the end I clued in on the fact that my family member needed me because he was clueless about the difficulty factor involved. And in fact, with my love for puzzles that was the one thing that caused an itch somewhere at the back of my mind: could I do it? Interesting hack, not because it was worth a lot of money. But this also offered me an easy out: I would simply tell him that I couldn’t do it. There is no way that he would be able to know whether or not I was lying. Yes, 5000 guilders per week was (and still is, though we use the Euro now) a boatload of money. And they’re nice cars. But some lines you just don’t cross.
Because what I could easily see is that this would be a beginning, and a bad beginning too. You can bet that someone somewhere will lose because of crap like this. (Fortunately, now the EU has made odometer fraud illegal). You can also bet that once you’ve done this thing and accepted the payment that you’re on the hook. You are now a criminal (or at least, you should be) and that means you’re susceptible to blackmail. The next request might not be so easy to refuse and could be a lot worse in nature. So I wasn’t really tempted, and I always felt that ‘but someone else will do it if I don’t’ was a lousy excuse.
If you’re reading this as a technical person: there will always be technically clueless people who will attempt to use you and your skills as tools to commit some crime. Be sure of two things: the first is that if the game is ever up they’ll do everything they can to let you hold the bag on it and that once you’re in you won’t be getting out that easily.
It must have seemed like a good idea at the time. Facing a sizable fraction of his own party that wanted to secede from the EU David Cameron made the gambit of the century: Let’s have a referendum and get this behind us once and for all. He never for one second thought that the ‘leave’ faction would be able to win that referendum and the end result would be to cement his own position for at least another election cycle to come. Alas, for everybody involved, we now know this was an extremely costly mistake.
Amidst claims of regret and being duped the UK population is rocked by the impact of what they’ve done, but even if everybody that wanted to would be allowed to ‘switch sides’ and vote again the ‘leave’ camp would still win, but by a smaller margin.
There are a number of driving forces behind the ‘brexit’ vote, and as I watched the whole thing unfold from my (Dutch, and so EU) vantage point I tried to make a small catalog of them without assigning them any relative weights.
The EU government is spectacularly out of touch with its subjects and does a very poor job of communicating the pluses and the minuses of being part of the union. As one of those subjects, and fairly politically informed, it always amazes me how opaque ‘Brussels’ is to those that would like to know how it all functions and what options we as ordinary citizens have to influence the proceedings outside of the votes we cast. There are veritable mountains of documents about the EU, but there is no relatively accessible piece of information that gives a person with average education an idea of how it all works and what the tools at hand are. The EU is generally viewed as a cost without upside (and the main upside is that the EU is much more stable than the countries that it unites), a net negative and a draw rather than a benefit. The fact that Brussels diplomats routinely take compensation without any performance whatsoever and that corruption is perceived as being wide-spread doesn’t help either. In general, EU politics are far away from the voters’ boots on the ground. This is as much a real problem as one of communications and can’t be solved easily.
The UK, a former world power, has seen its position marginalized further and further over the last 5 decades. An older generation hankers back to the days long gone and would like to see Great Britain restored to its former glory. This is understandable, but in my opinion somewhat mis-informed. The world is a much more connected place today than it was 50 years ago and next to a unified EU with the UK as an outsider (and, if we are to believe the latest developments with England as an outsider) it is not a very important country economically. The EU is a very large economic entity. When negotiating with 27 countries individually the UK of the past had formidable clout, but today the situation has changed very much and turning back the clock like this simply isn’t going to work.
Immigration, always a hot topic when things are not going well. The UK has its share of immigration issues, just like the rest of Europe. Unlike most of the rest of Europe, as an island there is the illusion that the physical borders are insulation against the issues that the rest of Europe struggles with as soon as the subject is the free movement of people. Right or wrong, it doesn’t matter, there are a lot of people in the UK that feel that ‘the foreigners took their jobs’, or that refugees are the kind of people that there simply isn’t room for. It’s a tough problem, but I highly doubt that this problem is large enough to isolate a country over from its main trade partners. On the one hand, there definitely is some truth to the downward pressure on wages from cheap competition (so when this affects you directly your vote for ‘exit’ is probably in the bag), on the other, a large influx of people that are most likely not going to be net contributors to the economy isn’t going to help either. But, and this is the bigger issue, exiting the EU will come with the requirement to re-negotiate a whole pile of treaties and the EU is most likely simply going to make all the same things that were tough to swallow pills in the past bargaining chips. And this time the UK (or what’s left of it) will not be in a position to refuse much of anything. So I highly doubt that this subject will be resolved through an exit of the UK from the EU.
Automation: Unlike immigrants vying for the jobs traditionally held by UK born blue collar workers (many of them second generation immigrants themselves) the automation wave of the last 30 years has done as much or more to damage the prospects of those that do not have a high level of education, and those that do not work in the immediate vicinity of a large population center. More and more jobs disappear through automation in almost every branch of industry. This has led to record un-employment and governments the world over (including the UK) are struggling with how to deal with this. For a laid off factory or agricultural worker it does not matter what the underlying reason for being jobless is, the frustration with the establishment to whom they would look to solve this is definitely understandable.
General protest votes against those in power seem to me to make up the remainder of the group that voted for the exit, and quite a few of those are now in the un-enviable position of having received what they wished for, a country whose leadership has already started infighting and which - to me as an outsider at least - appears to be utterly rudderless, which for a former seafaring giant is a very bad position to be in.
If the UK were a boat, it would appear as if the captain had descended into the hold with an axe and had made a giant hole in the bottom of the boat to prove that it can’t be sunk. Fortunately the UK is an island and literally sinking it is an impossibility, but the damage done dwarfs anything I’ve seen a political entity ever do to their own country.
The really puzzling thing about the composition of the ‘leave’ voters is that a very large number of them stand to be positioned squarely in the way of the blow that will land on the UK economy once the exit is a fact. I can see ‘change for change’s sake’ as an option but when it is all but a certainty that your own position will come out much worse it makes me wonder if the consequences have been thought through.
Juncker & co are happy to finally kick the naughty kid out of the class, and even though I understand their position I’d like to caution them not to be too rash, it’s just another example of the EU doing what it does best: to decide without any visible kind of process behind the decision, and I don’t recall voting for Juncker. For one a very large chunk of the UK voted ‘remain’ and to push the UK to exit too fast could very well alienate this extremely important faction within the UK, for another, it would appear that France and Germany would like to see the UK cut up into pieces or to no longer be a factor of note in EU politics so they can drive their plans forward unimpeded.
The damage is done, I for one would very much like to see restraint on the part of the EU leadership on how they deal with the self-inflicted crisis in the UK and to limit the damage where possible. If the UK loses some of its special status then that would be acceptable, but to push the UK out when it may be possible to retain it - or a large fraction of it - through some kind of compromise would be a mistake worthy of a Cameron, and we already know how that ended.
About a week ago I bought a really nice solid wooden table for a song in a second hand store. Hauling it home was quite the job, the thing weighs a ton. Fortunately the legs came off otherwise I’d still be standing in front of the staircase with it. After two days of working I noticed that the pinky and ring finger on my left hand felt numb and wouldn’t move the way they normally do. This is worrisome, my ability to type is part of my bread and butter and when it got worse the next day I figured I should do something about it.
Numbness usually indicates something is not ok with the nervous system so I started googling for what it could be and after a while I found this wikipedia article on Ulnar Nerve Entrapment. Which pretty much matched all my symptoms. And so I ended up suspecting that my nice shiny old-but-new-for-me table may be the culprit.
And it turns out this is the case! Because of the height of the table the angle at which my arms rest against it is a little bit different than what I’m used to, which causes the weight of my arms to rest partially on my wrists. That’s pretty normal, but instead of resting on a surface they now rest against the edge of the table. It doesn’t feel unpleasant so I wasn’t worried about it at all but it just so happens that right at that spot the nerve passes relatively unprotected and this mild pressure is enough to pinch the nerve.
So, heighten the chair and the symptoms are already getting less. What surprises me is that I never had any of this before and the speed with which it got worse. Let’s hope the recovery will be complete and that it will continue at the same speed. So, if you have a numbness in your pinky and ring finger better check the height of your table or, if you are a cyclist, how your arms rest against the handlebars if you have those fancy curved ones.