Tempting... I could do it. Part of my brain is wondering whether I'd need such code myself, though not in the next six months at least. It might be an interesting project to implement it in the new job system: the secondary physics engine can sit in its own thread and trundle along at its own pace. That should take a lot of the strain off the main physics engine when dealing with a ton of small objects. Passing collider data in and out of the second thread is going to be tricky, but not impossible.
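To make the threading idea a bit more concrete, here is a rough sketch in plain Python rather than Unity C#. It does not use the job system or any Unity API. Everything in it is hypothetical and only for illustration: the `SecondarySim` class, the `inbox`/`outbox` queues, and the single floor height standing in for real collider data. The worker thread steps at its own pace, picks up the newest collider snapshot from the main thread, and publishes only its latest positions back.

```python
import queue
import threading


class SecondarySim:
    """Hypothetical lightweight simulation for many small objects, run off the main thread."""

    def __init__(self, positions, velocities, dt=0.01):
        self.positions = list(positions)    # heights of the small objects
        self.velocities = list(velocities)  # vertical speeds
        self.dt = dt
        self.floor_y = 0.0                  # assumption: the only "collider" is a floor height
        self.inbox = queue.Queue()          # main thread -> worker: collider snapshots
        self.outbox = queue.Queue(maxsize=1)  # worker -> main thread: latest positions only
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def step(self):
        # Drain pending collider updates and keep only the newest one.
        while True:
            try:
                self.floor_y = self.inbox.get_nowait()
            except queue.Empty:
                break
        # Semi-implicit Euler under gravity, clamped to the floor.
        for i in range(len(self.positions)):
            self.velocities[i] -= 9.81 * self.dt
            self.positions[i] += self.velocities[i] * self.dt
            if self.positions[i] < self.floor_y:
                self.positions[i] = self.floor_y
                self.velocities[i] = 0.0

    def _publish(self):
        # Throw away a snapshot the main thread never read, then post the new one.
        try:
            self.outbox.get_nowait()
        except queue.Empty:
            pass
        self.outbox.put_nowait(tuple(self.positions))

    def _run(self):
        while not self._stop.is_set():
            self.step()
            self._publish()
            self._stop.wait(self.dt)  # pace the loop independently of the main thread

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    sim = SecondarySim([1.0, 2.0], [0.0, 0.0])
    sim.start()
    sim.inbox.put(0.5)                  # main thread raises the floor
    print(sim.outbox.get(timeout=1.0))  # main thread reads the latest positions
    sim.stop()
```

The main thread never touches the simulation state directly. It only pushes snapshots in and pulls snapshots out, which is the part that would need the most care in a real engine.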
It would also make sense to wait until Unity provides a struct-based math library, although I can't see that being much longer; it's possibly the single most useful library they could port. My implementation is more than a little crude, but there's no point getting it perfect when a Unity engine developer is getting paid to do it anyway.
Do you plan on posting the code to GitHub?