From: Dmitry Gutov
Subject: Re: sqlite3
Date: Tue, 14 Dec 2021 20:32:59 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 14.12.2021 19:43, Lars Ingebrigtsen wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
>
>> But a "proper" database might give other advantages like a faster search in the loaded data (unless it's already "indexed" by using hash tables everywhere where they could be used). Or being able to read the data without loading the whole file into memory. Which, for certain scenarios and data sets, might be a bigger advantage than faster writes.
>
> Here's a matrix of advantages and disadvantages to three approaches: sqlite, one-file-per-value, and one-file-with-a-hash-table-with-several/all-values:
>
>                          sqlite  files  hash
> Read/write value speed     ⚄      ⚅      ⚀
> Read/write value mem       ⚅      ⚄      ⚀
> List all values speed      ⚅      ⚀      ⚅
> List all values mem        ⚃      ⚁      ⚃
> Ease of moving around      ⚄      ⚀      ⚅

I'm not 100% sure how to interpret it (is a higher value for "mem" better or worse?), but it seems that, at least for the original scenario of large data sets, sqlite might still be optimal.

>>> But it turns out that sqlite3 is actually slower for this particular use case than just writing the data to a file (i.e., using the file system as the database; one file per value). So multisession.el now offers two backends (`files' and `sqlite'), and defaults to `files'.
>>
>> Does the latter scenario use as many files as you do 'COMMIT' in the former scenario?
>
> No, if you (cl-incf (multisession-value foo)) you'll get one COMMIT per time, but there'll only be one foo.value file (at a time).

OK, but it's still the same number of writes, more or less? IO is the slow part of most programs, and when it comes to an SQL database, it might have to do an update in multiple places (e.g. the data and the index), rather than do one smooth write.
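
To make the "same number of writes" question concrete, here is a rough sketch in Python (not Emacs Lisp, and not multisession.el's code) that performs the same number of updates both ways: rewriting a single foo.value file, and issuing one UPDATE plus COMMIT per change against sqlite. The table name `kv`, the database file name, and the function names are assumptions made up for illustration. The timings are only indicative: with default settings sqlite syncs to disk on commit, while the plain file write below does not, so this is not an apples-to-apples comparison.

```python
import os
import sqlite3
import tempfile
import time


def time_files(directory, n):
    """Overwrite one value file n times (n >= 1); return (seconds, final value)."""
    path = os.path.join(directory, "foo.value")
    start = time.perf_counter()
    for i in range(n):
        # One whole-file write per update; no fsync is requested here.
        with open(path, "w") as f:
            f.write(str(i))
    elapsed = time.perf_counter() - start
    with open(path) as f:
        return elapsed, f.read()


def time_sqlite(directory, n):
    """Update one row n times, committing each time; return (seconds, final value)."""
    db = sqlite3.connect(os.path.join(directory, "values.db"))
    # Hypothetical schema: the primary key gives sqlite an index to maintain.
    db.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
    db.execute("INSERT INTO kv VALUES ('foo', '0')")
    db.commit()
    start = time.perf_counter()
    for i in range(n):
        db.execute("UPDATE kv SET value = ? WHERE key = 'foo'", (str(i),))
        db.commit()  # one COMMIT per update, as in the cl-incf case
    elapsed = time.perf_counter() - start
    value = db.execute("SELECT value FROM kv WHERE key = 'foo'").fetchone()[0]
    db.close()
    return elapsed, value


def compare_backends(n=100):
    """Run both loops in a scratch directory and report times and final values."""
    with tempfile.TemporaryDirectory() as d:
        files_seconds, files_value = time_files(d, n)
        sqlite_seconds, sqlite_value = time_sqlite(d, n)
    return {
        "files_seconds": files_seconds,
        "files_value": files_value,
        "sqlite_seconds": sqlite_seconds,
        "sqlite_value": sqlite_value,
    }


if __name__ == "__main__":
    print(compare_backends(100))
```

Changing the payload from `str(i)` to something larger is one way to see how much the result depends on the size of each value.
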
Might also depend on the size of the write (how big the values are).

>> Speaking of the latter scheme, I might be missing some details, but sqlite should provide better atomicity guarantees in the case of being interrupted mid-write. Like, if we have one-file-per-value, then the total list of keys must live somewhere, and they can get desynchronized.
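
For reference, one common way a one-file-per-value store can sidestep both concerns in the quoted paragraph is sketched below in Python. This is an assumption about how such a scheme could work, not a description of what multisession.el does. Each value is written to a temporary file and then renamed over the final name, so a reader sees either the old or the new contents, and the list of keys is derived from the directory listing, so there is no separate index to fall out of sync. The function names and the `.value` suffix handling are hypothetical, and keys are assumed to be safe to use as file names.

```python
import os
import tempfile

SUFFIX = ".value"  # assumed naming convention: <key>.value


def write_value(directory, key, value):
    """Store value under key by writing a temp file and renaming it into place."""
    final_path = os.path.join(directory, key + SUFFIX)
    # The temp file lives in the same directory so the rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(value)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        # The rename is atomic on POSIX file systems: the old file is
        # replaced in one step, never left half-written.
        os.replace(tmp_path, final_path)
    except BaseException:
        # An interrupted write leaves the previous value file untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_value(directory, key):
    """Return the stored string for key."""
    with open(os.path.join(directory, key + SUFFIX)) as f:
        return f.read()


def list_keys(directory):
    """Derive the key list from the directory itself; no separate index file."""
    return sorted(
        name[: -len(SUFFIX)]
        for name in os.listdir(directory)
        if name.endswith(SUFFIX)
    )
```

The trade-off is that listing keys costs a directory scan, which would be consistent with the low "List all values speed" score for `files` in the matrix above.
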