sqlite
Links
- CLI Shell info: http://www.sqlite.org/sqlite.html
- Better CLI tutorial: http://souptonuts.sourceforge.net/readme_sqlite_tutorial.html
- FAQ - http://www.sqlite.org/faq.html
- When to use SQLite - https://www.sqlite.org/whentouse.html
- SpatiaLite - "SpatiaLite is an open source library intended to extend the SQLite core to support fully fledged Spatial SQL capabilities."
- Improved CLI - https://github.com/dbcli/litecli
- GUI tool with sqlite support - https://dbeaver.io
- "An open source multi-tool for exploring and publishing data" https://docs.datasette.io/en/stable/ / https://simonwillison.net/2018/Aug/19/instantly-publish-datasette/
- https://www.sqlite.org/appfileformat.html
- https://sqlite.org/src4/doc/trunk/www/design.wiki The Design of SQLite4
- https://til.simonwillison.net/sqlite/one-line-csv-operations Simon Willison has a lot of great sqlite knowledge, tools, and examples.
- https://sqlitestudio.pl: GUI app to "Create, edit, browse SQLite databases."
Syntax Examples
Open a db read-only
sqlite3 "file:///absolute/path/to/file/datasette.db?mode=ro"
Import a csv file
This method does not appear to support ~/filename
or $HOME/filename
, but does support relative and absolute paths. The sqlite3
help text says that -csv
will "set output mode to 'csv'", but it also affects .import
statements.
sqlite3 -csv filename.db ".import path/to/some_file.csv destination_table_name"
Export to a csv file, including headers
This would export the cards
database table as a csv:
sqlite3 -csv pokemon-card-collection.db -cmd '.headers on' 'select * from cards' > pokemon-cards.csv
Create a table
CREATE TABLE servers (
id INTEGER NOT NULL,
hostname VARCHAR(255),
ip_addr VARCHAR(32),
PRIMARY KEY (id), UNIQUE (id,hostname));
Or from a unix shell
sqlite3 foo.db "CREATE TABLE servers (
id INTEGER NOT NULL,
hostname VARCHAR(255),
ip_addr VARCHAR(32),
PRIMARY KEY (id),
UNIQUE (id,hostname));"
Add a column to the table
ALTER TABLE servers ADD os varchar(255);
Create a view
"a view is the result set of a stored query, which can be queried in the same manner as a persistent database collection object" - https://en.wikipedia.org/wiki/View_(SQL)
https://www.sqlite.org/lang_createview.html
The following view would show only rows from servers
where the ip_addr
starts with 192.168.
. This is an effective way to move logic into the database, potentially reudcing app complexity:
CREATE VIEW local_servers AS
SELECT hostname,ip_addr FROM servers WHERE ip_addr like '192.168.%' ;
Add rows to the table from unix shell
sqlite3 foo.db "insert into baz values ('50','some text');"
Add rows or update if the row already exists
This syntax is different from other SQL implementations
insert or replace into tablename(filename, hash) values
('/etc/hosts', 'somehash'),
('/etc/resolv.conf', 'someotherhash');
Output rows via the unix shell
This outputs as columns, but csv, html line, and list exist too.
sqlite3 -column foo.db "SELECT * FROM baz LIMIT 5;"
If -column
is truncating your output, instead use -list -separator ' '
-line
is the equivalent of mysql's \G
or postgres's \x
Select a field where its value occurs more than N times
select DateTimeOriginal
from photos
group by DateTimeOriginal
having count(DateTimeOriginal) > 1 ;
Select field A where the value Field B occurs more than N times
This selects all values for field A (SourceFile), including where duplicates exist for field B (DateTimeOriginal). The prior example would not have shown this if we had added SourceFile
to the select
.
select SourceFile from photos
where DateTimeOriginal in (
select DateTimeOriginal from photos
group by DateTimeOriginal
having count(DateTimeOriginal) > 1
) order by SourceFile ;
Select a random row from a table
TABLE='costreport'
sqlite3 -line CostReport-1.db "SELECT * FROM $TABLE
WHERE _ROWID_ >= (abs(random()) % (SELECT max(_ROWID_) FROM $TABLE))
LIMIT 1 ;"
Select data from a table and include an incrementing id column
Given the following data:
sqlite> .schema
CREATE TABLE IF NOT EXISTS "people"(
"name" TEXT, "age" TEXT);
sqlite> .mode box
sqlite> select * from people ;
┌───────┬─────┐
│ name │ age │
├───────┼─────┤
│ brad │ 18 │
│ david │ 9 │
│ grace │ 29 │
│ john │ 51 │
│ katie │ 23 │
│ nora │ 33 │
└───────┴─────┘
You can sort by age and add an id column to show the numeric ordering of their age:
sqlite> select row_number() over (order by cast(age as integer)) as id, * from people ;
┌────┬───────┬─────┐
│ id │ name │ age │
├────┼───────┼─────┤
│ 1 │ david │ 9 │
│ 2 │ brad │ 18 │
│ 3 │ katie │ 23 │
│ 4 │ grace │ 29 │
│ 5 │ nora │ 33 │
│ 6 │ john │ 51 │
└────┴───────┴─────┘
The row_number() over (order by cast(age as integer)) as id
adds the extra column. Because the age
column is a string we have to cast it to an integer for the sort to work correctly.
Sum one column ordered by another column
This does a quick cost analysis on an AWS Cost-Report-1.csv file.
## First sanitize the column names
sed -i '1 s#[/: ]#_#g' CostReport-1.csv # linux sed, macos use gsed
## Next import the csv into a sqlite db
sqlite3 -csv CostReport-1.db ".import CostReport-1.csv costreport"
## Then start a sqlite shell
sqlite3 CostReport-1.db
-- Output Usage Type, ARN or ID, and summed cost as columns
SELECT lineItem_UsageType, lineItem_ResourceId, sum(lineItem_BlendedCost) cost
FROM costreport
GROUP BY lineItem_ResourceId
ORDER BY cost ;
Dump a db from CLI
sqlite3 foo.db ".dump" > foo.sql
Search Skype chat history
sqlite3 "$HOME/Library/Application Support/Skype/daniel.austin/main.db" \
"SELECT author, timestamp, body_xml FROM messages WHERE body_xml LIKE '%music%' ;"
Expanded functionality skype history search
function skypesearch(){
skypeusername=$1
searchstring=$2
/usr/bin/env sqlite3 "$HOME/Library/Application Support/Skype/${skypeusername}/main.db" \
"SELECT author, datetime(timestamp,'unixepoch','localtime'), body_xml
FROM messages
WHERE body_xml
LIKE '%${searchstring}%' ;"
}
alias ss="skypesearch john.doe"
Quickly create an image database
A better way to do this would be to use sqlite-utils, which is part of datasette, and there is an example below that shows how to do this.
## Create the database
sqlite3 images.db "create table images (filename varchar(255), createdate timestamp, unique(filename))"
## Populate the database. This can be blindly re-run when new files are added.
exiftool -d "%s" -p 'insert into images values ("$filename", "$DateTimeOriginal");' -q -f -r . | sqlite3 images.db 2> /dev/null
## Query the database
sqlite3 images.db "SELECT filename,datetime(createdate,'unixepoch','localtime') as date FROM images WHERE date LIKE '2014-08-02%';"
Use exiftool and sqlite-utils to find duplicate photos
This example shows how to delete JPG files where a DNG also exists, for photos taken with a camera that keeps track of the ImageNumber, like most DSLRs.
First, ingest all the exif data from your photos into a sqlite db. We specify only the fields we need from exiftool, and output the data as a json blob, then use jq
to reformat the json as one record per line, then send that into sqlite-utils
. The sqlite-utils
--nl
option tells it to expect one record per line, --pk SourceFile
specifies that field (which is always present in exiftool json output) as the primary key to ensure uniqueness, --replace
will update any rows where the primary key already exists (so we can re-run this on an existing database). If we were not specifying exiftool fields, we would need an --alter
flag here to tell sqlite-utils to add columns to the table for exif fields that did not exist in any previously imported photos.
SOURCE_DIR=/some/absolute/path/with/photos
find "${SOURCE_DIR}" -type f -iname '*.jpg' -print0 -or -iname '*.dng' -print0 |
xargs -0 exiftool -SerialNumber -ImageNumber -FileType -json |
jq -c '.[]' |
sqlite-utils insert --nl --pk SourceFile --replace ~/photos.db photos -
Now do a sql select for all JPG files that finds the duplicates. The idea here is to find distinct shots from a specific camera body which are identified by concatenating the SerialNumber and the ImageNumber together, and only select rows with FileType = JPEG where that same shot also has a FileType = DNG entry. This avoids deletion of JPG files where there is no DNG for the same shot.
This example does echo rm
just for the example. Remove echo
to actually do the deletion. This example also shows how to interpolate fields in sqlite3 by using ||
to concatenate fields and strings to produce a single string that is used to compare with, or output as a result.
sqlite3 ~/photos.db "
select SourceFile from photos
where FileType = 'JPEG'
and SerialNumber || ImageNumber in (
select SerialNumber || ImageNumber from photos where FileType = 'DNG'
) ;
" |
xargs -r echo rm -fv
Vacuum a database file
"The VACUUM command rebuilds the database file, repacking it into a minimal amount of disk space." - https://www.sqlite.org/lang_vacuum.html
sqlite3 filename.db "VACUUM;"