Jacques Mattheij

Technology, Coding and Business

Sorting 2 Tons of Lego, Many Questions, Results

For part 1, see here. For part 2, see here


The machine is now capable of running un-attended for hours on end, which is a huge milestone. No more jamming or other nastiness that causes me to interrupt a run. Many little things contributed to this, I’ve looked at all the mechanics and figured out all the little things that went wrong one by one and have come up with solutions for them. Some of those were pretty weird, for instance very small Lego pieces went off the transport belt with such high velocities that they could end up pretty much everywhere, the solution for this was to moderate the duration of the puff relative to the size of the component and to make a skirt along the side of the transport belt to make sure that pieces don’t land on the return side of the belt which would cause them to get caught under the roller.


The machine is now twice as fast, this due to a pretty simple change, I’ve doubled the number of drop-off bins from 6 to 12, which reduces the number of passes through the machine. It now takes just 3 passes to get the Lego sorted in the categories below. This required extending the pneumatics with another 6 valves, another manifold and a bunch of wiring and tubing, the expanded controller now looks like this:

I’ve also ground off the old legs from the base (the treadmill) and welded on new ones to give a little bit more space to accomodate the new bins, but that’s pretty boring metal work.


The image database is now approximately 60K images of parts, this has had a positive effect on the accuracy of the recognition, fairly common parts now have very high recognition accuracy, less common parts reasonably high (> 95%), rare parts are still very poor but as more parts pass through the machine the training set gets larger and in time accuracy should improve further. Judging by the increase in accuracy from 16K images to 60K images there is something of a diminishing rate of return here and it will likely be that by the time we reach 98-99% accuracy there will be well over a million images in the training set.


I’ve reduce the image size a bit in the horizontal dimension, from 640 pixels wide to 320 wide. Now that I have more data to work with this seems to give better results, the difference isn’t huge but it is definitely reproducible. The RESNET50 network still seems to give the best compromise between accuracy and training speed. I’ve added some utilities to make it easier to merge new samples into the existing dataset, to detect (and correct) classification errors and to drive the valve solenoids in a more precise way to make sure that valves reliably open and close for precisely determined amounts of time. This also helps a lot in making sure that parts don’t shoot all over the room.


Overall the machine works well but I’m still not happy with the hopper and the first stage belt. It is running much slower than the speed at which the parts recognizer works (30 parts / second) and I really would like to come up with something better. I’ve looked at vibrating bowl feeders and even though they are interesting they are noisy and tend to be set up for one kind of part only. They’re also too slow. If anybody has a bright idea on how to reliably feed the parts then please let me know, it’s a hard problem for which I’m sure there is some kind of elegant and simple solution. I just haven’t found one yet :)


The project has had tremendous coverage from all kinds of interesting publications, IEEE Spectrum had an article, lots of internet based publications listed or linked it (for instance: Mental Floss, Endgadget, Mashable). If you have published or know about another article about the sorter please let me know and I’ll add it to the list.

Of course I have this totally backwards

Starting by buying Lego, sorting it and then thinking about how to best sell the end result is the exact opposite of how you should approach any kind of project, but truth be told I’m in this far more for the technical challenge than for the commercial part. That leaves me with a bit of a problem: The sorter is working so well now that I am actually sitting on piles of sorted Lego that would probably make someone happy. But I have absolutely no idea if the sort classes that I’m currently using are of interest.

So if you’re a fanatical Lego builder I’d very much like to hear from you how you would like to buy Lego parts in somewhat larger quantities, say from 500 Grams (roughly one pound) and up. Another thing I would like to know is where you’re located so I can figure out shipping and handling.

As you can see right now the sorting is mostly by functional groups, slopes with slopes, bricks with bricks and so on. But there are many more possibilities to sort lego, for instance by color. Please let me know in what quantity and what kind of groupings would be the most useful to you as a Lego builder.

My twitter is at twitter.com/jmattheij and my email address is jacques@mattheij.com, here are some pictures of what the current product classes look like, but these can fairly easily be changed if there is demand for other mixes.



Space and Aircraft:


Wedge plates:

Vehicle parts:


1 wide bricks:

1 wide plates, modified:

1 wide plates:

Hinges and couplers:

Minifigs and minifig accessories:

2 wide plates:





Plates 6 wide:

Plates 4 wide:


Bricks 2 wide:

Doors and windows:

Construction equipment:



Bricks 1 wide, modified:

Macaroni pieces:

Corner pieces:





Helicopter blades:

Stepped pieces:

HN Submission/Discussion
If you read this far you should probably follow me on twitter: