Quickstart
Creating New Tables
New tables have to be created with the in-memory implementation. This is straightforward:
from ljson import Table, Header
# create the header
header = Header({
    "id": {"type": "int", "modifiers": ["unique"]},
    "name": {"type": "str", "modifiers": []},
    "age": {"type": "int", "modifiers": []},
})
table = Table(header, [])
To use the on-disk implementation, the table first has to be saved to a file:
from ljson.disk import Table as DiskTable
# save the table on the disk
fio = open("test.ljson", "w+")
table.save(fio)
# use the on-disk implementation
table = DiskTable.from_file(fio)
Inserting Data
Data is inserted into a table with the method additem, which takes a dict containing the data:
table.additem({"id": 0, "name": "Peter", "age": 20})
table.additem({"id": 1, "name": "Gustav", "age": 17})
table.additem({"id": 2, "name": "Peter", "age": 21})
table.additem({"id": 3, "name": "Sally", "age": 17})
Using Tables
ljson tables offer a fairly pythonic interface. Elements are accessed either by iterating:
for row in table:
    # do something with row
    # row is a dict containing the data
    print(row)
Note: You cannot set data using this method.
Or by using queries. A query indexes the table with a dict and returns a Selector object. A selector is iterable, and single elements can be accessed with the method getone:
for row in table[{"name": "Peter"}]:
    print(row)
table[{"age": 17}]["age"] = 18
peter_1_age = table[{"id":0}].getone("age")
If the dict contains more than one key-value pair, all conditions are joined by a logical and.
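The matching rule can be pictured in plain Python. The following is only a sketch of the semantics described above, not ljson's implementation: the helper names matches and select are made up for this illustration, and a list of dicts stands in for a table so that the example runs without ljson.

```python
# Plain-Python illustration of dict queries.
# `matches` and `select` are hypothetical helpers, not part of ljson.
rows = [
    {"id": 0, "name": "Peter", "age": 20},
    {"id": 1, "name": "Gustav", "age": 17},
    {"id": 2, "name": "Peter", "age": 21},
    {"id": 3, "name": "Sally", "age": 17},
]

def matches(row, query):
    # Every key-value pair has to match: the conditions are joined by "and".
    return all(key in row and row[key] == value for key, value in query.items())

def select(rows, query):
    # Return the rows that satisfy all conditions of the query.
    return [row for row in rows if matches(row, query)]

peters = select(rows, {"name": "Peter"})
peters_aged_21 = select(rows, {"name": "Peter", "age": 21})
print([row["id"] for row in peters])          # [0, 2]
print([row["id"] for row in peters_aged_21])  # [2]
```

Adding a second key to the query can only narrow the result, never widen it.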
Cookbook
This chapter contains some recipes for ljson.
Simple Min and Max
As ljson tables are iterable, the built-in min and max functions work on them:
youngest = min(table, key=lambda row: row["age"])
As suggested in “Data Science from Scratch” by Joel Grus, one can write a small helper function for that:
def picker(keyname):
    return lambda row: row[keyname]
oldest = max(table, key=picker("age"))
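The standard library already ships a key-function factory of this kind: operator.itemgetter. The sketch below uses a plain list of dicts in place of a table so that it runs without ljson; it assumes that the rows yielded by a table behave like these dicts.

```python
from operator import itemgetter

# A list of dicts standing in for an ljson table (assumption for this sketch).
rows = [
    {"id": 0, "name": "Peter", "age": 20},
    {"id": 1, "name": "Gustav", "age": 17},
    {"id": 2, "name": "Peter", "age": 21},
    {"id": 3, "name": "Sally", "age": 17},
]

# itemgetter("age") returns a callable that evaluates row["age"].
youngest = min(rows, key=itemgetter("age"))
oldest = max(rows, key=itemgetter("age"))
print(youngest["name"], oldest["name"])  # Gustav Peter
```

When several rows share the minimum, min returns the first one it encounters, which is why Gustav is picked rather than Sally.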
Min and Max and Queries
Selectors are iterable too, so... guess what:
oldest_peter = max(table[{"name": "Peter"}], key=picker("age"))
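One thing to keep in mind: the built-in min and max raise a ValueError when they are given an empty iterable, which would be the case for a query that matches nothing (assuming such a selector simply yields no rows). Their default parameter avoids the exception. The sketch below uses list comprehensions in place of queries so that it runs without ljson.

```python
# A list of dicts standing in for an ljson table (assumption for this sketch).
rows = [
    {"id": 0, "name": "Peter", "age": 20},
    {"id": 1, "name": "Gustav", "age": 17},
    {"id": 2, "name": "Peter", "age": 21},
    {"id": 3, "name": "Sally", "age": 17},
]

peters = [row for row in rows if row["name"] == "Peter"]
# String comparison is exact, so the lowercase spelling matches nothing.
nobody = [row for row in rows if row["name"] == "peter"]

oldest_peter = max(peters, key=lambda row: row["age"])
# Without default=..., max() would raise ValueError on the empty list.
oldest_nobody = max(nobody, key=lambda row: row["age"], default=None)
print(oldest_peter["id"])  # 2
print(oldest_nobody)       # None
```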
Converting CSV to LJSON
Assuming you have a file called input.csv and you want to convert it to a file output.ljson, you can use the function ljson.convert.csv.csv2file. First, take a look at your file:
id,age,name
0,20,Peter
1,17,Gustav
2,21,Peter
3,17,Sally
Now open the files and convert them:
from ljson.convert.csv import *
fio = open("input.csv", "r")
fout = open("output.ljson", "w")
disk_table = csv2file(fio, fout, types = {"name": "str", "id": "int", "age": "int"})
fio.close()
fout.close()
This is the content of output.ljson:
{"__type__": "header", "id": {"type": "int", "modifiers": []}, "age": {"type": "int", "modifiers": []}, "name": {"type": "str", "modifiers": []}}
{"id": 0, "age": 20, "name": "Peter"}
{"id": 1, "age": 17, "name": "Gustav"}
{"id": 2, "age": 21, "name": "Peter"}
{"id": 3, "age": 17, "name": "Sally"}
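As the output shows, every line of the file is a standalone JSON document: the header comes first, followed by one row per line. The following sketch reproduces the conversion for the sample data with nothing but the standard library. It illustrates the file format shown above and is not the code behind csv2file; the function name csv_to_ljson_lines and the CASTS mapping are made up for this example.

```python
import csv
import io
import json

# Casts for the two type names used in this example (hypothetical mapping).
CASTS = {"int": int, "str": str}

def csv_to_ljson_lines(csv_file, types):
    """Yield one JSON document per line: first the header, then the rows."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames  # column order is taken from the CSV header
    header = {"__type__": "header"}
    for column in columns:
        header[column] = {"type": types[column], "modifiers": []}
    yield json.dumps(header)
    for record in reader:
        # CSV values are strings, so convert each one to its declared type.
        row = {column: CASTS[types[column]](record[column]) for column in columns}
        yield json.dumps(row)

source = io.StringIO("id,age,name\n0,20,Peter\n1,17,Gustav\n2,21,Peter\n3,17,Sally\n")
lines = list(csv_to_ljson_lines(source, {"name": "str", "id": "int", "age": "int"}))
print("\n".join(lines))
```

Because each line parses on its own, a reader can process such a file row by row instead of loading it as a whole.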
Converting LJSON to CSV
This is pretty simple as well. It is recommended to use the on-disk implementation for such conversions, as it avoids loading large amounts of data into RAM.
fout = open("output.csv", "w")
table2csv(disk_table, fout)
fout.close()
Using Context Managers
Since version 0.1.0 ljson tables are context managers. This makes it easy to manage disk tables:
with DiskTable.open("output.ljson") as table:
    with open("output.csv", "w") as fout:
        table2csv(table, fout)
# now both fout and table are closed properly.
Deleting Items
Deleting rows is supported since 0.1.1:
with DiskTable.open("output.ljson") as table:
    del table[{"id": 0}]
Gotchas
In v0.3.0 a new feature was added: LjsonQueryResult objects. They are returned by LjsonSelector.__getitem__ and serve two purposes:
- Behave like a list for nearly everything
- Make it possible to edit tables in a pythonic way.
Therefore something like this is possible:
with DiskTable.open("data.json") as table:
    table[{"id": 0}]["age"] += 1
Which is pretty nice. But it also should work like a list, so:
with DiskTable.open("data.json") as table:
    result = table[{"id": 0}]["age"] + [1]
Will produce [21, 1].
So unfortunately:
with DiskTable.open("data.json") as table:
    table[{"id": 0}]["age"] = table[{"id": 0}]["age"] + 1
Will fail (and in general this leads to undefined behaviour).
The LjsonQueryResult class overrides all __i*__ methods (the in-place operator methods such as __iadd__), while all other methods are passed to the underlying list.
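The difference between the two operator families can be reproduced with a small toy class. This is a sketch under stated assumptions and not ljson's implementation: ToyResult is a hypothetical stand-in that edits rows in its in-place operator and hands the ordinary operator to a plain list.

```python
class ToyResult:
    """Hypothetical stand-in for a query result; not ljson's class."""

    def __init__(self, rows, column):
        self._rows = rows
        self._column = column

    def _values(self):
        # The plain list that the non in-place operations work on.
        return [row[self._column] for row in self._rows]

    def __iadd__(self, other):
        # In-place operator: applied to every selected row, i.e. an edit.
        for row in self._rows:
            row[self._column] += other
        return self

    def __add__(self, other):
        # Ordinary operator: delegated to the list, i.e. concatenation.
        return self._values() + other

rows = [{"id": 0, "age": 20}]
result = ToyResult(rows, "age")

result += 1                # edits the row in place
print(rows[0]["age"])      # 21
print(result + [1])        # [21, 1]

try:
    result + 1             # list concatenation with an int
except TypeError as error:
    print("TypeError:", error)
```

In this toy model, result + 1 ends up as a list plus an int, which is why the expression on the right-hand side of the failing assignment cannot work as an increment.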