21.07.2021 at 07:46 am

Basis for Base64

Reductionist SQLite storage.

I've lately been storing generated binaries and small files in SQLite via base64 text and BLOB fields.
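For the curious, here is a minimal sketch of the idea using Python's standard `sqlite3` and `base64` modules. The table name `files`, its columns, and the helper functions are hypothetical, made up for illustration rather than taken from my actual schema. It stores the same payload both ways, as base64 text and as a raw BLOB, alongside a little metadata.

```python
import base64
import sqlite3


def store_file(conn, name, data):
    """Store one payload twice: as base64 text and as a raw BLOB."""
    encoded = base64.b64encode(data).decode("ascii")
    conn.execute(
        "INSERT INTO files (name, size, b64, raw) VALUES (?, ?, ?, ?)",
        (name, len(data), encoded, data),
    )


def load_file(conn, name):
    """Return the payload recovered from the text column and from the BLOB column."""
    row = conn.execute(
        "SELECT b64, raw FROM files WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    b64_text, raw = row
    return base64.b64decode(b64_text), bytes(raw)


# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " name TEXT PRIMARY KEY,"
    " size INTEGER NOT NULL,"
    " b64 TEXT NOT NULL,"
    " raw BLOB NOT NULL)"
)

payload = bytes(range(256))  # every possible byte value, including NUL
store_file(conn, "demo.bin", payload)
from_text, from_blob = load_file(conn, "demo.bin")
stored_text_length = conn.execute(
    "SELECT length(b64) FROM files WHERE name = ?", ("demo.bin",)
).fetchone()[0]
```

Both columns round-trip the bytes unchanged. The text column is simply longer than the BLOB, which is the price of keeping everything as plain text.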

I note the claims by Django folks that storing files in a database is 'not good practice' (a claim that appears quite contrary to formal experimental verification, and is possibly untrue unless you're network-bound).

Arguably they have a point with BLOBs. But I don't think I'll agree with the arguments against base64 storage: hey, it's just text. And I like it. It reduces filesystem management, and I get a relational framework to manage metadata rather than having to rely on the OS's filesystem, which often feels slow (especially on Windows).

This was also an interesting read on the history of Base64:

Your first mistake is thinking that ASCII encoding and Base64 encoding are interchangeable. They are not. They are used for different purposes.

To understand why Base64 was necessary in the first place we need a little history of computing. ...

Originally a lot of different encodings were created (e.g. Baudot code) which used a different number of bits per character, until eventually ASCII became a standard with 7 bits per character. However, most computers store binary data in bytes consisting of 8 bits each, so ASCII is unsuitable for transferring this type of data. Some systems would even wipe the most significant bit. Furthermore, the difference in line ending encodings across systems means that the ASCII characters 10 and 13 were also sometimes modified.

To solve these problems Base64 encoding was introduced. This allows you to encode arbitrary bytes to bytes which are known to be safe to send without getting corrupted (ASCII alphanumeric characters and a couple of symbols). The disadvantage is that encoding the message using Base64 increases its length - every 3 bytes of data is encoded to 4 ASCII characters.
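The 3-to-4 expansion can be checked with a short sketch using Python's standard `base64` module. The helper `encoded_length` is a hypothetical name introduced here for illustration. It assumes standard padded Base64 with no line wrapping.

```python
import base64


def encoded_length(n_bytes):
    """Length of padded Base64 output: 4 characters per group of up to 3 bytes."""
    return 4 * ((n_bytes + 2) // 3)


# Bytes that a 7-bit or line-ending-rewriting channel could corrupt:
# two values with the high bit set, then LF (10) and CR (13).
risky = bytes([0xFF, 0xFE, 10, 13])
safe = base64.b64encode(risky)

# Every output byte is 7-bit ASCII, and none is a line-ending character.
all_seven_bit = all(b < 128 for b in safe)
has_line_endings = (b"\n" in safe) or (b"\r" in safe)

# The output length follows the 3-bytes-in, 4-characters-out rule.
lengths_match = all(
    len(base64.b64encode(bytes(n))) == encoded_length(n)
    for n in range(0, 50)
)
```

Inputs whose length is not a multiple of 3 are padded with `=`, so the overhead is slightly more than one third for very short payloads.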

To send text reliably you can first encode to bytes using a text encoding of your choice (for example UTF-8) and then afterwards Base64 encode the resulting binary data into a text string that is safe to send encoded as ASCII. The receiver will have to reverse this process to recover the original message. This of course requires that the receiver knows which encodings were used, and this information often needs to be sent separately.
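A minimal sketch of that two-step round trip, again with Python's standard library. The function names `encode_message` and `decode_message` are hypothetical, and UTF-8 is only the assumed default. The last lines show what happens when the receiver guesses the text encoding wrongly.

```python
import base64


def encode_message(text, charset="utf-8"):
    """Text -> bytes (via charset) -> Base64 -> ASCII-only string."""
    return base64.b64encode(text.encode(charset)).decode("ascii")


def decode_message(wire, charset="utf-8"):
    """Reverse the process: Base64 -> bytes -> text (via the same charset)."""
    return base64.b64decode(wire).decode(charset)


original = "naïve café"
wire = encode_message(original)

# The wire form contains only ASCII characters, so it survives 7-bit channels.
wire_is_ascii = all(ord(ch) < 128 for ch in wire)

# The receiver recovers the message only if it uses the same text encoding.
recovered = decode_message(wire)
misread = decode_message(wire, charset="latin-1")
```

With the matching charset the text comes back intact. With `latin-1` the Base64 step still succeeds, but the accented characters come out garbled, which is why the encoding has to be communicated along with the data.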

© Wan Zafran. See disclaimer.