[OMEMO] sanitize BLOB values written by buggy code in python2 version of plugin
- Gajim git master 6ef3b8e01b2b01
- OMEMO plugin 2.6.52
Follow-up for #399 (comment 192608), related: #278 (closed), #389 (closed)
I've investigated this issue over the last few weeks, and it looks like it is not a bug in Python or SQLite but a bug in the OMEMO plugin that shows up in its python2 flavor.
Here are test scripts:
test-blob-write-buggy-using-py2.py
test-blob-read-buggy-using-py2.py
This illustrates the issue when keys have been written with Gajim 0.16.x (python2) and are then read with Gajim 1.x (python3):
$ rm -f test.db
$ python ./test-blob-write-buggy-using-py2.py
DONE
$ python ./test-blob-read-buggy-using-py2.py
OK
$ python3 ./test-blob-read-buggy-using-py2.py
FAILED
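The failure can also be reproduced with python3 alone. The following is a minimal sketch, not the attached test scripts: it assumes that `CAST(? AS TEXT)` is an acceptable stand-in for what the python2 code wrote (raw key bytes stored with storage class TEXT), and the key bytes are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE identities (some_id INTEGER, public_key BLOB)")

# Hypothetical key bytes, deliberately not valid UTF-8.
public_key = b"\x05\xff\xfe\x80key"

# Assumption: CAST(... AS TEXT) mimics the buggy python2 write,
# i.e. the raw bytes are stored with storage class TEXT.
con.execute("INSERT INTO identities VALUES (?, CAST(? AS TEXT))",
            (123, public_key))
con.commit()

# SQLite itself reports how the value is stored.
storage_class = con.execute(
    "SELECT typeof(public_key) FROM identities").fetchone()[0]

# A plain read makes python3 try to decode the TEXT value as UTF-8.
try:
    con.execute("SELECT public_key FROM identities").fetchone()
    read_ok = True
except (sqlite3.Error, UnicodeDecodeError):
    read_ok = False

# An explicit cast to BLOB returns the raw bytes unchanged.
recovered = con.execute(
    "SELECT CAST(public_key AS BLOB) FROM identities").fetchone()[0]
con.close()

print(storage_class, read_ok, recovered == public_key)
```

`typeof()` is a convenient way to check how a given row is stored without decoding it.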
To work around this run-time failure under python2:
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
the following was added to the plugin:
con.text_factory = bytes
That only hides the error. Instead, in python2 the variable holding the BLOB data should be wrapped in buffer() when it is passed to the query:
buffer(public_key)
Then the content is saved properly as a BLOB, not as a string.
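For code that has to run under both interpreters, one option is `sqlite3.Binary`, which is an alias for `buffer` on python2 and for `memoryview` on python3. A minimal sketch, assuming the table layout of the test database used in this report; the key bytes are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE identities (some_id INTEGER, public_key BLOB)")

# Hypothetical key bytes, for illustration only.
public_key = b"\x05\x8f\x95V\xa2"

# sqlite3.Binary wraps the bytes so they are bound as a BLOB parameter.
con.execute("INSERT INTO identities VALUES (?, ?)",
            (123, sqlite3.Binary(public_key)))
con.commit()

storage_class, stored_key = con.execute(
    "SELECT typeof(public_key), public_key FROM identities").fetchone()
con.close()

# bytes() normalizes the result (python2 returns a buffer object here).
stored_key = bytes(stored_key)
print(storage_class, stored_key == public_key)
```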
Writing using python2:
$ rm -f test.db
$ python ./test-blob-write.py
DONE
$ python ./test-blob-read.py
OK
$ python3 ./test-blob-read.py
OK
and writing using python3:
$ python3 ./test-blob-write.py
DONE
$ python ./test-blob-read.py
OK
$ python3 ./test-blob-read.py
OK
This is an example of how to sanitize an existing DB (fetching with an explicit cast to BLOB) using python3:
import sqlite3
con = sqlite3.connect("test.db")
query = '''SELECT some_id, CAST(public_key as BLOB) FROM identities'''
results = con.execute(query).fetchall()
for result in results:
    some_id = result[0]
    public_key = result[1]
    con.execute('UPDATE identities SET public_key = ? WHERE some_id = ?', (public_key, some_id))
con.commit()
con.close()
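A slightly more defensive variant of the script above could restrict the update to rows whose storage class is actually TEXT, so rows already stored as BLOB are left untouched. This is only a sketch: the function name `sanitize_identities` is hypothetical, it keys on SQLite's implicit `rowid` rather than `some_id`, and the buggy row is simulated with `CAST(... AS TEXT)` in an in-memory database.

```python
import sqlite3


def sanitize_identities(con):
    """Rewrite public_key values stored as TEXT so they are stored as BLOB.

    Returns the number of rows that were rewritten.
    """
    # Only the CAST result is fetched, so python3 never tries to decode
    # the broken TEXT value as UTF-8.
    rows = con.execute(
        "SELECT rowid, CAST(public_key AS BLOB) FROM identities "
        "WHERE typeof(public_key) = 'text'"
    ).fetchall()
    con.executemany(
        "UPDATE identities SET public_key = ? WHERE rowid = ?",
        [(key, rowid) for rowid, key in rows],
    )
    con.commit()
    return len(rows)


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE identities (some_id INTEGER, public_key BLOB)")

key = b"\x05\xff\xfe\x80key"  # hypothetical key bytes
# Row 123 simulates the buggy python2 write, row 124 is stored correctly.
con.execute("INSERT INTO identities VALUES (?, CAST(? AS TEXT))", (123, key))
con.execute("INSERT INTO identities VALUES (?, ?)", (124, key))
con.commit()

fixed = sanitize_identities(con)
types = [row[0] for row in con.execute(
    "SELECT typeof(public_key) FROM identities ORDER BY some_id")]
keys = [row[0] for row in con.execute(
    "SELECT public_key FROM identities ORDER BY some_id")]
con.close()

print(fixed, types)
```

Running the function a second time would find no TEXT rows and change nothing, so it is safe to call on every start-up.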
Testing:
$ python ./test-blob-write-buggy-using-py2.py
DONE
$ python3 ./test-blob-read.py
FAILED
$ python3 ./test-blob-sanitize.py
$ python3 ./test-blob-read.py
OK
Before sanitizing:
$ echo ".dump identities" | sqlite3 -readonly test.db
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE identities (some_id INTEGER, public_key BLOB);
INSERT INTO identities VALUES(123,replace(���!Np��:,�%\nU&�!(� S�܉�ÿͽ�v �','\n',char(10)));
COMMIT;
After sanitizing:
$ echo ".dump identities" | sqlite3 -readonly test.db
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE identities (some_id INTEGER, public_key BLOB);
INSERT INTO identities VALUES(123,X'08a29cb1071221054e70a5e71a3a2cdf250a55268721281b8f9556a20953b9dc89c5c3bfcdbd99761a20880015acbfa1517b91044f4e260c38d62e900f2c7ed9a2dc76bd12da9d925070');
COMMIT;
Binary data should always be stored as BLOB, not as TEXT/string; otherwise conventional tools will not handle it properly.