The short answer is, yes you can. The hardware is compatible with it, and the 0.2 release of the software adds some portability fixes and extra functionality to make that much easier to do. It will work with both 32 and 64-bit Windows versions.
The longer answer is that it depends a bit on exactly what you want to do.
With Linux and the BSDs, you have the option both to read random bits directly out of the BitBabbler devices for use in your own application, and to integrate it with the native kernel support so that all applications can benefit from the entropy it provides without any modification. And you can do both simultaneously if that is what you need.
With Windows, it's a bit more complex than that. While it is possible to add it as an entropy source for the native CryptoAPI, it's not clear that there is any guarantee of what the CryptoAPI will actually do with it, or how much weight it will give it when obtaining a new seed, or how often it will use it to reseed its PRNG. The official documentation on this is pretty thin, and as a black-box implementation, for all we know it could simply just discard everything we feed it.
So we haven't had much interest so far from Windows users wanting to go that way. Instead, they are reading entropy from the device directly for use in their applications. The software we provide allows applications to request entropy on demand from a socket interface, which means it can either run natively on the Windows machine itself, or be run entirely separately on a small secure ARM device or similar which provides entropy over a network connection (and we currently have users doing both those things).
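The socket interface mentioned above can be consumed from almost any language. As a rough sketch only (the wire protocol, host, and port here are placeholders, not the actual protocol of the BitBabbler software; check the documentation shipped with the package), a client that reads a fixed number of seed bytes might look like this, with a tiny stand-in server included so the example is self-contained:

```python
import os
import socket
import threading

def serve_entropy(listener: socket.socket) -> None:
    """Stand-in for a real entropy daemon: stream random bytes to one client.

    This mimics only the general shape of a socket entropy service; the real
    BitBabbler software's protocol may differ, and os.urandom() here is just
    a placeholder for the device itself.
    """
    conn, _ = listener.accept()
    with conn:
        conn.sendall(os.urandom(64))

def read_seed(host: str, port: int, nbytes: int) -> bytes:
    """Read exactly nbytes of entropy from a socket entropy service."""
    buf = b""
    with socket.create_connection((host, port)) as s:
        while len(buf) < nbytes:
            chunk = s.recv(nbytes - len(buf))
            if not chunk:
                raise EOFError("entropy service closed the connection early")
            buf += chunk
    return buf

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # pick any free local port for the demo
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=serve_entropy, args=(listener,), daemon=True).start()

seed = read_seed("127.0.0.1", port, 32)
print(len(seed))  # 32
```

The same client shape works whether the service runs locally on the Windows machine or on a separate hardened device across the network.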
Making this useful for you is about supporting what you need to solve your problems, though, so if you need a strong entropy source on Windows machines and need something different from that, then we should talk more about that too.
Yes, this was one of the initial use cases that was important for us too, so we definitely want this to be well supported.
If you only have a single BitBabbler device that you want to bind to a single VM then this is really easy to do. If you have multiple devices that you want to divide between multiple VMs, then it gets a bit more involved to set up, but there is detailed documentation included in the software package for how to do that too. Mostly it's just a case of ensuring the host and the guest both agree on which device(s) should belong to which VM, and which, if any, remain for the use of the host machine.
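For the simple one-device, one-VM case with libvirt, the passthrough amounts to a USB hostdev entry in the guest definition. A minimal sketch (the vendor/product IDs shown are the FTDI-based IDs BitBabbler devices normally present; verify them with `lsusb` on your own host, and see the documentation in the software package for the multi-device case):

```xml
<!-- Attach the host USB device matching these IDs to the guest. -->
<hostdev mode='subsystem' type='usb' managed='yes'>
  <source>
    <vendor id='0x0403'/>
    <product id='0x7840'/>
  </source>
</hostdev>
```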
So far we've mainly focussed on doing this with KVM/QEMU and libvirt, but any VM that allows USB passthrough should be able to do this in a similar manner. There are a couple of things that libvirt could support better which would make the multiple device case easier to configure, and sending them patches for that is on our todo list if nobody else gets to that before us.
Update: we now have an even better way to do this, though there is still room to improve its implementation further with better support on the libvirt side of things.
This is something we could definitely still improve on, even if just with some easy to follow documentation to assist people with setting that up, but for now probably the best bet is to look at building a custom version of Tails with the BitBabbler software package included. That should be fairly straightforward, but you'll need to follow their documentation on the process for that.
We think this is an important use case, even if it wasn't originally on our own list of needs, so making this easier however we can is certainly something we are keen to see happen.
Why does `dd if=/dev/random of=test bs=1024 count=1` give me less than 1024 bytes of output, even when `/proc/sys/kernel/random/entropy_avail` shows much more than that, and why does it give me a different number of bytes each time I run it?
The simple answers here are that `entropy_avail` actually reports the number of bits in the input pool, and that what you probably wanted to run instead of this was something more like either one of:

```shell
dd if=/dev/random of=test bs=1 count=1024
dd if=/dev/random of=test bs=1024 count=1 iflag=fullblock
```

Either of those will give you the full 1024 bytes you wanted.
But the real answer to why you get some oddball number of bytes each time is a bit more involved and a bit less intuitive. People who are already familiar with the details of `dd` will know that `bs=1024 count=1` tries to read a single block of up to 1024 bytes, and will return less than that if the full amount is not immediately available – but that doesn't explain why it still might return considerably less than what `entropy_avail` would seem to indicate is available.

To understand that part, you need to look at how the Linux kernel maintains its three separate entropy pools. What `entropy_avail` reports is the number of bits that are credited to the input pool, which is what we fill from the BitBabbler when its watermark falls below the wakeup point, and what gets filled by the other sources of entropy that the kernel collects.

A read from `/dev/random`, however, will take bits from the blocking pool, so the number of bytes that may be returned in a single read is limited by what is currently remaining in that. It will be refilled in multiples of a constant block size by mixing in fresh entropy from the input pool, but if it has some bits available it will only return what it already has before doing so, and there is no external interface to monitor what that number currently is. If you only read a small number of bits from it, then it may not even need to take any bits from the input pool, or reduce what `entropy_avail` reports, at all. So the number of bytes that a single read returns is not directly related to how much entropy is currently available overall, only to how much was already read from the blocking pool since the last time it was refilled. As long as there is still more entropy available in the input pool, a second read attempt immediately after the first one will immediately return more bytes. The results you get just look confusing because the two numbers you are looking at are not actually directly related to each other at all.
[ The third pool is the non-blocking pool, which operates in a similar way but is used to feed `/dev/urandom`, which will continue to return bits from the kernel CSPRNG even when there is no more entropy available in the input pool to reseed it. ]
Why does `/dev/random` seem slower than I expected?
There are several reasons for this. But the important one to understand is that the `/dev/random` interface was never intended for streaming large amounts of random bits out of it quickly. It was designed to return small amounts of high quality seed material for a CSPRNG. To quote from the `random(4)` man page:
The kernel random-number generator is designed to produce a small amount of high-quality seed material to seed a cryptographic pseudo-random number generator (CPRNG). It is designed for security, not speed, and is poorly suited to generating large amounts of random data. Users should be very economical in the amount of seed material that they read from `/dev/random`; unnecessarily reading large quantities of data from this device will have a negative impact on other users of the device.
The amount of seed material required to generate a cryptographic key equals the effective key size of the key. For example, a 3072-bit RSA or Diffie-Hellman private key has an effective key size of 128 bits (it requires about 2^128 operations to break) so a key generator only needs 128 bits (16 bytes) of seed material from `/dev/random`.
While some safety margin above that minimum is reasonable, as a guard against flaws in the CPRNG algorithm, no cryptographic primitive available today can hope to promise more than 256 bits of security, so if any program reads more than 256 bits (32 bytes) from the kernel random pool per invocation, or per reasonable reseed interval (not less than one minute), that should be taken as a sign that its cryptography is not skillfully implemented.
So in line with this recommended use, by default we are also quite conservative about how we feed fresh entropy to the kernel input pool. When it falls below its watermark and wakes us to refill it, we read 20,000 bits out of the BitBabbler pool, a size chosen to let us run a full FIPS 140-2 analysis on the block before it is passed to the kernel (as well as the other QA testing that is constantly done on bits read from the device). Since the kernel will credit us for at most 4096 bits of entropy, we then fold those 20,000 bits twice, down to a block of 5000 (which is again fed back into the continuous QA testing for further verification) before passing them to the kernel.
Since in most cases the input pool will not actually be completely empty when this occurs, it effectively means we may only be getting credited for 1000 or 2000 bits of new entropy for each 20,000 bits read from the device once the kernel input pool is again at its maximum capacity of 4096 bits. While this may seem at first glance like a rather large factor of 'wasted' potential bitrate, in practice it means we are actually using bits from the device that would otherwise really be wasted by just not being used at all.
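The "folding" described above is simply XOR-ing the two halves of a block together, which halves its size while concentrating its entropy; doing it twice takes 20,000 bits down to 5,000. A small sketch of that operation (our own illustration of the idea, not the project's actual implementation, with `os.urandom` standing in for bits read from the device):

```python
import os

def fold(block: bytes) -> bytes:
    """XOR the two halves of a block together, halving its length."""
    assert len(block) % 2 == 0
    half = len(block) // 2
    return bytes(a ^ b for a, b in zip(block[:half], block[half:]))

raw = os.urandom(2500)     # 20,000 bits, simulating one read from the device
folded = fold(fold(raw))   # folded twice: 20,000 -> 10,000 -> 5,000 bits
print(len(folded) * 8)     # 5000
```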
Even with the worst case multiplier for credit per bit, and the most bit-hungry end of the recommendation for using `/dev/random` above, there will still be enough fresh entropy being fed in for over 400 simultaneous threads to all be reseeding their CSPRNG every second. Once you also add the recommendation to not reseed more often than once per minute, that means a single device is capable of supplying enough entropy for around 25,000 crypto-using application threads in this conservative default configuration. And with two BitBabbler devices, twice that number can easily be supported, since the output rate scales almost linearly (which we've tested for up to 60 devices in a single machine).
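Those figures are easy to sanity-check with back-of-envelope arithmetic. The raw device rate assumed below (2.5 Mbit/s) is our assumption for illustration, not a quoted specification; the other numbers come from the text above:

```python
raw_rate = 2_500_000     # bits/s from one device (ASSUMED rate, for illustration)
raw_per_refill = 20_000  # bits read from the device per refill of the kernel pool
credited_worst = 1_000   # worst-case bits credited to the input pool per refill
bits_per_reseed = 256    # most bit-hungry recommended amount per CSPRNG reseed

credited_rate = raw_rate // raw_per_refill * credited_worst
reseeds_per_second = credited_rate // bits_per_reseed
reseeds_per_minute = reseeds_per_second * 60

print(reseeds_per_second)  # 488  -> "over 400" threads reseeding every second
print(reseeds_per_minute)  # 29280 -> "around 25,000" reseeding once per minute
```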
It would be quite possible for us to adopt a less conservative configuration for this, which would increase the proportion of bits that we will actually be credited for, but so far that hasn't seemed necessary for any practical real use case. If you want raw entropy at the full rate that the BitBabbler is capable of, the best way to get it is to read it directly from the BitBabbler pool without going through the overhead of the `/dev/random` device (since there are other factors of its implementation which also make it inherently slow no matter how fast we might feed it new bits).
If you have a real use case which doesn't fit with these assumptions, for whatever reason, then we should definitely talk about that and look at making this more flexible to suit, but for the moment this mostly seems to just surprise people who try to measure the maximum rate of bits they can read from `/dev/random`, rather than having any adverse effect on ensuring the system always has sufficient entropy available to be read from that device for normal use. A high rate of bits is good, but using them all as wisely as we can in normal use scenarios is even better.
Yes, for many uses you can. But something still has to initialise it with a good random seed if it's to be actually secure. A good CSPRNG can generate large amounts of random numbers from a much smaller amount of initial entropy, but ultimately it's only as unpredictable as the entropy which you seed it with.
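That seed-then-amplify relationship can be sketched with a toy hash-based generator (purely illustrative and NOT a vetted DRBG; use a real, reviewed CSPRNG in practice): a few dozen truly random bytes go in, and an unlimited deterministic stream comes out, which is only as unpredictable as those input bytes.

```python
import hashlib
import os

class ToyHashPRNG:
    """Illustrative SHA-256 counter generator: NOT a vetted CSPRNG."""

    def __init__(self, seed: bytes):
        # The generator's entire unpredictability comes from this seed.
        self.key = hashlib.sha256(seed).digest()
        self.counter = 0

    def read(self, nbytes: int) -> bytes:
        out = b""
        while len(out) < nbytes:
            block = hashlib.sha256(
                self.key + self.counter.to_bytes(8, "big")
            ).digest()
            out += block
            self.counter += 1
        return out[:nbytes]

seed = os.urandom(32)     # the small part a TRNG must supply
prng = ToyHashPRNG(seed)
stream = prng.read(4096)  # amplified deterministic output
print(len(stream))        # 4096
```

Note that the output is fully determined by the 32-byte seed: anyone who learns the seed can reproduce the whole stream, which is exactly why the seed itself must come from a genuinely unpredictable source.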
Asking if a TRNG is better than a well studied and widely trusted CSPRNG is asking the wrong question. Almost any practical and ostensibly secure system will need and want to make use of both: a TRNG to provide a reliable source for the unpredictable seed material, and a CSPRNG to quickly amplify that into longer strings of random output. Good CSPRNGs are well known and widely available. It was the problem of ensuring we had good seed material for them which was not so easy to be happy with, and which the BitBabbler project aimed to solve in a more thorough and convincing manner than the options we were seeing at the time.

Unless you're doing research which requires large amounts of purely non-deterministic randomness, or have other special needs, the BitBabbler isn't a replacement for a CSPRNG, it's a complementary tool which performs a job that is essential for using one securely.
While you can use a BitBabbler alone to generate large quantities of good random data quickly, you can't securely use a CSPRNG without some mechanism to generate the genuinely unpredictable seed material it requires.
It's fairly pointless to speculate about that here. The more plainly obvious problem with it for our purposes is that it is simply impossible to verify that their generator (on any given chip) is even statistically sound, since you can only examine the bits after they have been cryptographically whitened. It doesn't matter whether it's been subverted if it's not even possible to prove that it isn't just plain vanilla bad. A simple counter could be whitened in exactly the same way to produce results which are indistinguishable by statistical testing from the ones it creates.
And besides, why mess about trying to be tricky with RDRAND when you already have a red carpet tradesmen's entrance in your chip like this.
The bottom line is, the more independently verifiable sources you can collect entropy from and mix together securely, the more robust you can be if one or more of them fails catastrophically, whether that failure is by accident or design.
We're engineers, not philosophers. And while we do find the whole gamut of questions surrounding free will, and whether spooky action at a distance is real or not, to be incredibly interesting and worthy of long late-night discussions and mind-expanding musing – the concept of a True Random Number Generator is a timeworn term of art used to describe devices which obtain their randomness from unpredictable physical events (like the BitBabbler does).
Whatever the ultimate truth may be as to just how omniscient you would need to be before they stop being unpredictable, this distinguishes them as a group from deterministic mathematical algorithms which also create a statistically random distribution but which are entirely predictable (even to mortals who can do basic math) if you know the internal state of the algorithm, and which in turn are described as Pseudo Random Number Generators.
We're sticking with the long established and accepted terminology here, because that's how we can have sensible conversations with other people who are also familiar with the current state of the art. There are far more interesting problems to explore in this space than trying to coin new jargon for it.