Tutorial
Version: 0.2.2 Last Updated: 06/05/06 15:31:56
View: Paged  |  One Page

This tutorial provides a relatively simple walking tour through the basic concepts of SQLAlchemy. You may wish to skip it and dive into the main manual which is more reference-oriented. The examples in this tutorial comprise a fully working interactive Python session, and are guaranteed to be functioning courtesy of doctest.

Installation
Installing SQLAlchemy

Installing SQLAlchemy from scratch is most easily achieved with setuptools. (setuptools installation). Just run this from the command-line:

$ easy_install SQLAlchemy

This command will download the latest version of SQLAlchemy from the Python Cheese Shop and install it to your system.

Otherwise, you can install from the distribution using the setup.py script:

$ python setup.py install
back to section top
Installing a Database API

SQLAlchemy is designed to operate with a DBAPI implementation built for a particular database, and includes support for the most popular databases. If you have one of the supported DBAPI implementations, you can proceed to the following section. Otherwise SQLite is an easy-to-use database to get started with, which works with plain files or in-memory databases.

To work with SQLite, you'll need:

  • pysqlite - Python interface for SQLite
  • SQLite library

Note that the SQLite library download is not required with Windows, as the Windows Pysqlite library already includes it linked in. Pysqlite and SQLite can also be installed on Linux or FreeBSD via pre-made packages or from sources.

back to section top
Getting Started
Imports

SQLAlchemy provides the entire namespace of everything you'll need under the module name sqlalchemy. For the purposes of this tutorial, we will import its full list of symbols into our own local namespace.

>>> from sqlalchemy import *
back to section top
Connecting to the Database

After our imports, the next thing we need is a handle to the desired database, represented by an Engine object. This object handles the business of managing connections and dealing with the specifics of a particular database. Below, we will make a SQLite connection to a file-based database called "tutorial.db".

>>> db = create_engine('sqlite:///tutorial.db')

For full information on creating database engines, including those for SQLite and others, see Database Engines.

back to section top
Working with Database Objects

A core philosophy of SQLAlchemy is that tables and domain classes are different beasts. For this reason, SQLAlchemy provides constructs that represent tables by themselves (known as table metadata). So we will begin by constructing table metadata objects and performing SQL operations with them directly. Later, we will look into SQLAlchemy's Object Relational Mapper (ORM), which provides an additional layer of abstraction onto table metadata, allowing us to load and save objects of any arbitrary Python class.

Defining Metadata, Binding to Engines

Firstly, your Tables have to belong to a collection called MetaData. We will create a handy form of MetaData that automatically connects to our Engine (connecting a schema object to an Engine is called binding):

>>> metadata = BoundMetaData(db)

An equivalent operation is to create the BoundMetaData object directly with an Engine URL, which calls the create_engine call for us:

>>> metadata = BoundMetaData('sqlite:///tutorial.db')

Now, when we tell "metadata" about the tables in our database, we can issue CREATE statements for those tables, as well as create and execute SQL statements derived from them, without needing to open or close any connections; that will be all done automatically. Note that this feature is entirely optional. SQLAlchemy includes full support for explicit Connections used with schema and SQL constructs that are entirely unbound to any Engine.

For the purposes of this tutorial, we will stick with "bound" objects, as it makes the code simpler and easier to read.

back to section top
Creating a Table

With metadata as our established home for tables, lets make a Table for it:

>>> users_table = Table('users', metadata,
...     Column('user_id', Integer, primary_key=True),
...     Column('user_name', String(40)),
...     Column('password', String(10))
... )

As you might have guessed, we have just defined a table named users which has three columns: user_id (which is a primary key column), user_name and password. Currently it is just an object that doesn't necessarily correspond to an existing table in our database. To actually create the table, we use the create() method. To make it interesting, we will have SQLAlchemy echo the SQL statements it sends to the database, by setting the echo flag on the Engine associated with our BoundMetaData:

>>> metadata.engine.echo = True
>>> users_table.create() 
CREATE TABLE users(
    user_id INTEGER NOT NULL PRIMARY KEY,
    user_name VARCHAR(40),
    password VARCHAR(10)
)
...

Alternatively, the users table might already exist (such as, if you're running examples from this tutorial for the second time), in which case you can just skip the create() method call. You can even skip defining the individual columns in the users table and ask SQLAlchemy to load its definition from the database:

>>> users_table = Table('users', metadata, autoload=True)
>>> list(users_table.columns)[0].name
'user_id'

Documentation on table metadata is available in Database Meta Data.

back to section top
Inserting Rows

Inserting is achieved via the insert() method, which defines a clause object (known as a ClauseElement) representing an INSERT statement:

>>> i = users_table.insert()
>>> i 
<sqlalchemy.sql.Insert object at 0x...>
>>> print i
INSERT INTO users (user_id, user_name, password) VALUES (?, ?, ?)

Since we created this insert statement object from the users table which is bound to our Engine, the statement itself is also bound to the Engine, and supports executing itself. The execute() method of the clause object will compile the object into a string according to the underlying dialect of the Engine to which the statement is bound, and will then execute the resulting statement.

>>> i.execute(user_name='Mary', password='secure') 
INSERT INTO users (user_name, password) VALUES (?, ?)
['Mary', 'secure']
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

>>> i.execute({'user_name':'Tom'}, {'user_name':'Fred'}, {'user_name':'Harry'}) 
INSERT INTO users (user_name) VALUES (?)
[['Tom'], ['Fred'], ['Harry']]
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

Note that the VALUES clause of each INSERT statement was automatically adjusted to correspond to the parameters sent to the execute() method. This is because the compilation step of a ClauseElement takes into account not just the constructed SQL object and the specifics of the type of database being used, but the execution parameters sent along as well.

When constructing clause objects, SQLAlchemy will bind all literal values into bind parameters. On the construction side, bind parameters are always treated as named parameters. At compilation time, SQLAlchemy will convert them into their proper format, based on the paramstyle of the underlying DBAPI. This works equally well for all named and positional bind parameter formats described in the DBAPI specification.

Documentation on inserting: Inserts.

back to section top
Selecting

Let's check that the data we have put into users table is actually there. The procedure is analogous to the insert example above, except you now call the select() method off the users table:

>>> s = users_table.select()
>>> print s
SELECT users.user_id, users.user_name, users.password 
FROM users
>>> r = s.execute()
SELECT users.user_id, users.user_name, users.password 
FROM users
[]

This time, we won't ignore the return value of execute(). Its an instance of ResultProxy, which is a result-holding object that behaves very similarly to the cursor object one deals with directly with a database API:

>>> r 
<sqlalchemy.engine.base.ResultProxy instance at 0x...>
>>> r.fetchone()
(1, u'Mary', u'secure')
>>> r.fetchall()
[(2, u'Tom', None), (3, u'Fred', None), (4, u'Harry', None)]

Query criterion for the select is specified using Python expressions, using the Column objects in the Table as a base. All expressions constructed from Column objects are themselves instances of ClauseElements, just like the Select, Insert, and Table objects themselves.

>>> r = users_table.select(users_table.c.user_name=='Harry').execute()
SELECT users.user_id, users.user_name, users.password 
FROM users 
WHERE users.user_name = ?
['Harry']
>>> row = r.fetchone()
>>> print row
(4, u'Harry', None)

Pretty much the full range of standard SQL operations are supported as constructed Python expressions, including joins, ordering, grouping, functions, correlated subqueries, unions, etc. Documentation on selecting: Simple Select.

back to section top
Working with Rows

You can see that when we print out the rows returned by an execution result, it prints the rows as tuples. These rows in fact support both the list and dictionary interfaces. The dictionary interface allows the addressing of columns by string column name, or even the original Column object:

>>> row.keys()
['user_id', 'user_name', 'password']
>>> row['user_id'], row[1], row[users_table.c.password] 
(4, u'Harry', None)

Addressing the columns in a row based on the original Column object is especially handy, as it eliminates the need to work with literal column names altogether.

Result sets also support iteration. We'll show this with a slightly different form of select that allows you to specify the specific columns to be selected:

>>> for row in select([users_table.c.user_id, users_table.c.user_name]).execute(): 
...     print row
SELECT users.user_id, users.user_name
FROM users
[]
(1, u'Mary')
(2, u'Tom')
(3, u'Fred')
(4, u'Harry')
back to section top
Table Relationships

Lets create a second table, email_addresses, which references the users table. To define the relationship between the two tables, we will use the ForeignKey construct. We will also issue the CREATE statement for the table in one step:

>>> email_addresses_table = Table('email_addresses', metadata,
...     Column('address_id', Integer, primary_key=True),
...     Column('email_address', String(100), nullable=False),
...     Column('user_id', Integer, ForeignKey('users.user_id'))).create() 
CREATE TABLE email_addresses(
    address_id INTEGER NOT NULL PRIMARY KEY,
    email_address VARCHAR(100) NOT NULL,
    user_id INTEGER REFERENCES users(user_id)
)
...

Above, the email_addresses table is related to the users table via the ForeignKey('users.user_id'). The ForeignKey constructor can take a Column object or a string representing the table and column name. When using the string argument, the referenced table must exist within the same MetaData object; thats where it looks for the other table!

Next, lets put a few rows in:

>>> email_addresses_table.insert().execute(
...     {'email_address':'tom@tom.com', 'user_id':2},
...     {'email_address':'mary@mary.com', 'user_id':1}) 
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
[['tom@tom.com', 2], ['mary@mary.com', 1]]
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

With two related tables, we can now construct a join amongst them using the join method:

>>> r = users_table.join(email_addresses_table).select().execute()
SELECT users.user_id, users.user_name, users.password, email_addresses.address_id, email_addresses.email_address, email_addresses.user_id 
FROM users JOIN email_addresses ON users.user_id = email_addresses.user_id
[]
>>> print [row for row in r]
[(1, u'Mary', u'secure', 2, u'mary@mary.com', 1), (2, u'Tom', None, 1, u'tom@tom.com', 2)]

The join method is also a standalone function in the sqlalchemy namespace. The join condition is figured out from the foreign keys of the Table objects given. The condition (also called the "ON clause") can be specified explicitly, such as in this example where we locate all users that used their email address as their password:

>>> print join(users_table, email_addresses_table, 
...     and_(users_table.c.user_id==email_addresses_table.c.user_id, 
...     users_table.c.password==email_addresses_table.c.email_address)
...     )
users JOIN email_addresses ON users.user_id = email_addresses.user_id AND users.password = email_addresses.email_address
back to section top
Working with Object Mappers

Now that we have a little bit of Table and SQL operations covered, lets look into SQLAlchemy's ORM (object relational mapper). With the ORM, you associate Tables (and other Selectable units, like queries and table aliases) with Python classes, into units called Mappers. Then you can execute queries that return lists of object instances, instead of result sets. The object instances themselves are associated with an object called a Session, which automatically tracks changes on each object and supports a "save all at once" operation called a flush.

Creating a Mapper

A Mapper is usually created once per Python class, and at its core primarily means to say, "objects of this class are to be stored as rows in this table". Lets create a class called User, which will represent a user object that is stored in our users table:

>>> class User(object):
...     def __repr__(self):
...         return "(User %s,password:%s)" % (self.user_name, self.password)

The class is a new style class (i.e. it extends object) and does not require a constructor (although one may be provided if desired). We just have one __repr__ method on it which will display basic information about the User. Note that the __repr__ method references the instance variables user_name and password which otherwise aren't defined. While we are free to explicitly define these attributes and treat them normally, this is optional; as SQLAlchemy's Mapper construct will manage them for us, since their names correspond to the names of columns in the users table. Lets create a mapper, and observe that these attributes are now defined:

>>> usermapper = mapper(User, users_table)
>>> u1 = User()
>>> print u1.user_name
None
>>> print u1.password
None

The mapper function returns a new instance of Mapper. As it is the first Mapper we have created for the User class, it is known as the classes' primary mapper. We generally don't need to hold onto the usermapper instance variable; SA's ORM can automatically locate this Mapper when it deals with the class, or instances of that class.

back to section top
Obtaining a Session

After you create a Mapper, all operations with that Mapper require the usage of an important object called a Session. All objects loaded or saved by the Mapper must be attached to a Session object, which represents a kind of "workspace" of objects that are loaded into memory. A particular object instance can only be attached to one Session at a time.

By default, you have to create a Session object explicitly before you can load or save objects. Theres several ways to manage sessions, but the most straightforward is to just create one, which we will do by saying, create_session():

>>> session = create_session()
>>> session 
<sqlalchemy.orm.session.Session object at 0x...>
back to section top
The Query Object

The Session has all kinds of methods on it to retrieve and store objects, and also to view their current status. The Session also provides an easy interface which can be used to query the database, by giving you an instance to a Query object corresponding to a particular Python class:

>>> query = session.query(User)
>>> print query.select_by(user_name='Harry')
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name = ? ORDER BY users.oid
['Harry']
[(User Harry,password:None)]

All querying for objects is performed via an instance of Query. The various select methods on an instance of Mapper also use an underlying Query object to perform the operation. A Query is always bound to a specific Session.

Lets turn off the database echoing for a moment, and try out a few methods on Query. Methods that end with the suffix _by primarily take keyword arguments which correspond to properties on the object. Other methods take ClauseElement objects, which are constructed by using Column objects inside of Python expressions, in the same way as we did with our SQL select example in the previous section of this tutorial. Using ClauseElement structures to query objects is more verbose but more flexible:

>>> metadata.engine.echo = False
>>> print query.select(User.c.user_id==3)
[(User Fred,password:None)]
>>> print query.get(2)
(User Tom,password:None)
>>> print query.get_by(user_name='Mary')
(User Mary,password:secure)
>>> print query.selectfirst(User.c.password==None)
(User Tom,password:None)
>>> print query.count()
4

Notice that our User class has a special attribute c attached to it. This 'c' represents the columns on the User's mapper's Table object. Saying User.c.user_name is synonymous with saying users_table.c.user_name, recalling that User is the Python class and users is our Table object.

back to section top
Making Changes

With a little experience in loading objects, lets see what its like to make changes. First, lets create a new user "Ed". We do this by just constructing the new object. Then, we just add it to the session:

>>> ed = User()
>>> ed.user_name = 'Ed'
>>> ed.password = 'edspassword'
>>> session.save(ed)
>>> ed in session
True

Lets also make a few changes on some of the objects in the database. We will load them with our Query object, and then change some things.

>>> mary = query.get_by(user_name='Mary')
>>> harry = query.get_by(user_name='Harry')
>>> mary.password = 'marysnewpassword'
>>> harry.password = 'harrysnewpassword'

At the moment, nothing has been saved to the database; all of our changes are in memory only. What happens if some other part of the application also tries to load 'Mary' from the database and make some changes before we had a chance to save it ? Assuming that the same Session is used, loading 'Mary' from the database a second time will issue a second query in order locate the primary key of 'Mary', but will return the same object instance as the one already loaded. This behavior is due to an important property of the Session known as the identity map:

>>> mary2 = query.get_by(user_name='Mary')
>>> mary is mary2
True

With the identity map, a single Session can be relied upon to keep all loaded instances straight.

As far as the issue of the same object being modified in two different Sessions, that's an issue of concurrency detection; SQLAlchemy does some basic concurrency checks when saving objects, with the option for a stronger check using version ids. See None for more details.

back to section top
Saving

With a new user "ed" and some changes made on "Mary" and "Harry", lets also mark "Fred" as deleted:

>>> fred = query.get_by(user_name='Fred')
>>> session.delete(fred)

Then to send all of our changes to the database, we flush() the Session. Lets turn echo back on to see this happen!:

>>> metadata.engine.echo = True
>>> session.flush()
BEGIN
UPDATE users SET password=? WHERE users.user_id = ?
['marysnewpassword', 1]
UPDATE users SET password=? WHERE users.user_id = ?
['harrysnewpassword', 4]
INSERT INTO users (user_name, password) VALUES (?, ?)
['Ed', 'edspassword']
DELETE FROM users WHERE users.user_id = ?
[3]
COMMIT
back to section top
Relationships

When our User object contains relationships to other kinds of information, such as a list of email addresses, we can indicate this by using a function when creating the Mapper called relation(). While there is a lot you can do with relations, we'll cover a simple one here. First, recall that our users table has a foreign key relationship to another table called email_addresses. A single row in email_addresses has a column user_id that references a row in the users table; since many rows in the email_addresses table can reference a single row in users, this is called a one to many relationship.

First, deal with the email_addresses table by itself. We will create a new class Address which represents a single row in the email_addresses table, and a corresponding Mapper which will associate the Address class with the email_addresses table:

>>> class Address(object):
...     def __init__(self, email_address):
...         self.email_address = email_address
...     def __repr__(self):
...         return "(Address %s)" % (self.email_address)
>>> mapper(Address, email_addresses_table) 
<sqlalchemy.orm.mapper.Mapper object at 0x...>

Next, we associate the User and Address classes together by creating a relation using relation(), and then adding that relation to the User mapper, using the add_property function:

>>> usermapper.add_property('addresses', relation(Address))

The relation() function takes either a class or a Mapper as its first argument, and has many options to further control its behavior. The 'User' mapper has now placed additional property on each User instance called addresses. SQLAlchemy will automatically determine that this relationship is a one-to-many relationship, and will subsequently create addresses as a list. When a new User is created, this list will begin as empty.

Lets see what we get for the email addresses already in the database. Since we have made a change to the mapper's configuration, its best that we clear out our Session, which is currently holding onto every User object we have already loaded:

>>> session.clear()

We can then treat the addresses attribute on each User object like a regular list:

>>> mary = query.get_by(user_name='Mary') 
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name = ? ORDER BY users.oid 
LIMIT 1 OFFSET 0
['Mary']
>>> print [a for a in mary.addresses]
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address 
FROM email_addresses 
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[1]
[(Address mary@mary.com)]

Adding to the list is just as easy. New Address objects will be detected and saved when we flush the Session:

>>> mary.addresses.append(Address('mary2@gmail.com'))
>>> session.flush() 
BEGIN
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['mary2@gmail.com', 1]
COMMIT

Main documentation for using mappers: Data Mapping

back to section top
Transactions

You may have noticed from the example above that when we say session.flush(), SQLAlchemy indicates the names BEGIN and COMMIT to indicate a transaction with the database. The flush() method, since it may execute many statements in a row, will automatically use a transaction in order to execute these instructions. But what if we want to use flush() inside of a larger transaction? This is performed via the SessionTransaction object, which we can establish using session.create_transaction(). Below, we will perform a more complicated SELECT statement, make several changes to our collection of users and email addresess, and then create a new user with two email addresses, within the context of a transaction. We will perform a flush() in the middle of it to write the changes we have so far, and then allow the remaining changes to be written when we finally commit() the transaction. We enclose our operations within a try/except block to insure that resources are properly freed:

>>> transaction = session.create_transaction()
>>> try: 
...     (ed, harry, mary) = session.query(User).select(
...         User.c.user_name.in_('Ed', 'Harry', 'Mary'), order_by=User.c.user_name
...     )
...     del mary.addresses[1]
...     harry.addresses.append(Address('harry2@gmail.com'))
...     session.flush()
...     print "***flushed the session***"
...     fred = User()
...     fred.user_name = 'fred_again'
...     fred.addresses.append(Address('fred@fred.com'))
...     fred.addresses.append(Address('fredsnewemail@fred.com'))
...     session.save(fred)
...     transaction.commit()
... except:
...     transaction.rollback()
...     raise
BEGIN
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name IN (?, ?, ?) ORDER BY users.user_name
['Ed', 'Harry', 'Mary']
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address 
FROM email_addresses 
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[4]
UPDATE email_addresses SET user_id=? WHERE email_addresses.address_id = ?
[None, 3]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['harry2@gmail.com', 4]
***flushed the session***    
INSERT INTO users (user_name, password) VALUES (?, ?)
['fred_again', None]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fred@fred.com', 6]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fredsnewemail@fred.com', 6]
COMMIT

Main documentation: Session / Unit of Work

back to section top
Next Steps

That covers a quick tour through the basic idea of SQLAlchemy, in its simplest form. Beyond that, one should familiarize oneself with the basics of Sessions, the various patterns that can be used to define different kinds of Mappers and relations among them, the rudimentary SQL types that are available when constructing Tables, and the basics of Engines, SQL statements, and database Connections.

back to section top