SQLAlchemy 0.2 Documentation
Version: 0.2.2 Last Updated: 06/05/06 15:31:56

This tutorial provides a relatively simple walking tour through the basic concepts of SQLAlchemy. You may wish to skip it and dive into the main manual which is more reference-oriented. The examples in this tutorial comprise a fully working interactive Python session, and are guaranteed to be functioning courtesy of doctest.

Installation
Installing SQLAlchemy

Installing SQLAlchemy from scratch is most easily achieved with setuptools; see the setuptools installation instructions. Just run this from the command line:

$ easy_install SQLAlchemy

This command will download the latest version of SQLAlchemy from the Python Cheese Shop and install it to your system.

Otherwise, you can install from the distribution using the setup.py script:

$ python setup.py install
Installing a Database API

SQLAlchemy is designed to operate with a DBAPI implementation built for a particular database, and includes support for the most popular databases. If you have one of the supported DBAPI implementations, you can proceed to the following section. Otherwise SQLite is an easy-to-use database to get started with, which works with plain files or in-memory databases.

To work with SQLite, you'll need:

  • pysqlite - Python interface for SQLite
  • SQLite library

Note that the SQLite library download is not required on Windows, as the Windows Pysqlite library already includes it linked in. Pysqlite and SQLite can also be installed on Linux or FreeBSD via pre-made packages or from sources.

Getting Started
Imports

SQLAlchemy provides the entire namespace of everything you'll need under the module name sqlalchemy. For the purposes of this tutorial, we will import its full list of symbols into our own local namespace.

>>> from sqlalchemy import *
Connecting to the Database

After our imports, the next thing we need is a handle to the desired database, represented by an Engine object. This object handles the business of managing connections and dealing with the specifics of a particular database. Below, we will make a SQLite connection to a file-based database called "tutorial.db".

>>> db = create_engine('sqlite:///tutorial.db')

For full information on creating database engines, including those for SQLite and others, see Database Engines.

Working with Database Objects

A core philosophy of SQLAlchemy is that tables and domain classes are different beasts. For this reason, SQLAlchemy provides constructs that represent tables by themselves (known as table metadata). So we will begin by constructing table metadata objects and performing SQL operations with them directly. Later, we will look into SQLAlchemy's Object Relational Mapper (ORM), which provides an additional layer of abstraction onto table metadata, allowing us to load and save objects of any arbitrary Python class.

Defining Metadata, Binding to Engines

Firstly, your Tables have to belong to a collection called MetaData. We will create a handy form of MetaData that automatically connects to our Engine (connecting a schema object to an Engine is called binding):

>>> metadata = BoundMetaData(db)

An equivalent operation is to create the BoundMetaData object directly with an Engine URL, which performs the create_engine call for us:

>>> metadata = BoundMetaData('sqlite:///tutorial.db')

Now, when we tell "metadata" about the tables in our database, we can issue CREATE statements for those tables, as well as create and execute SQL statements derived from them, without needing to open or close any connections; that will all be done automatically. Note that this feature is entirely optional. SQLAlchemy includes full support for explicit Connections used with schema and SQL constructs that are entirely unbound to any Engine.

For the purposes of this tutorial, we will stick with "bound" objects, as it makes the code simpler and easier to read.

Creating a Table

With metadata as our established home for tables, let's make a Table for it:

>>> users_table = Table('users', metadata,
...     Column('user_id', Integer, primary_key=True),
...     Column('user_name', String(40)),
...     Column('password', String(10))
... )

As you might have guessed, we have just defined a table named users which has three columns: user_id (which is a primary key column), user_name and password. Currently it is just an object that doesn't necessarily correspond to an existing table in our database. To actually create the table, we use the create() method. To make it interesting, we will have SQLAlchemy echo the SQL statements it sends to the database, by setting the echo flag on the Engine associated with our BoundMetaData:

>>> metadata.engine.echo = True
>>> users_table.create() 
CREATE TABLE users(
    user_id INTEGER NOT NULL PRIMARY KEY,
    user_name VARCHAR(40),
    password VARCHAR(10)
)
...

Alternatively, the users table might already exist (such as if you're running examples from this tutorial for the second time), in which case you can just skip the create() method call. You can even skip defining the individual columns of the users table and ask SQLAlchemy to load its definition from the database:

>>> users_table = Table('users', metadata, autoload=True)
>>> list(users_table.columns)[0].name
'user_id'

Documentation on table metadata is available in Database Meta Data.

Inserting Rows

Inserting is achieved via the insert() method, which defines a clause object (known as a ClauseElement) representing an INSERT statement:

>>> i = users_table.insert()
>>> i 
<sqlalchemy.sql.Insert object at 0x...>
>>> print i
INSERT INTO users (user_id, user_name, password) VALUES (?, ?, ?)

Since we created this insert statement object from the users table which is bound to our Engine, the statement itself is also bound to the Engine, and supports executing itself. The execute() method of the clause object will compile the object into a string according to the underlying dialect of the Engine to which the statement is bound, and will then execute the resulting statement.

>>> i.execute(user_name='Mary', password='secure') 
INSERT INTO users (user_name, password) VALUES (?, ?)
['Mary', 'secure']
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

>>> i.execute({'user_name':'Tom'}, {'user_name':'Fred'}, {'user_name':'Harry'}) 
INSERT INTO users (user_name) VALUES (?)
[['Tom'], ['Fred'], ['Harry']]
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

Note that the VALUES clause of each INSERT statement was automatically adjusted to correspond to the parameters sent to the execute() method. This is because the compilation step of a ClauseElement takes into account not just the constructed SQL object and the specifics of the type of database being used, but the execution parameters sent along as well.

When constructing clause objects, SQLAlchemy will bind all literal values into bind parameters. On the construction side, bind parameters are always treated as named parameters. At compilation time, SQLAlchemy will convert them into their proper format, based on the paramstyle of the underlying DBAPI. This works equally well for all named and positional bind parameter formats described in the DBAPI specification.
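For instance, constructing an insert with a literal value produces a named bind parameter, which is then rendered in the paramstyle of the underlying DBAPI; a minimal sketch, assuming the values keyword behaves as in the insert examples above (qmark-style "?" for pysqlite):

# a sketch: literal values become named bind parameters at construction
i = users_table.insert(values={'user_name': 'Wendy'})

# compiled against our SQLite engine, the parameter is rendered
# in the DBAPI's "qmark" paramstyle:
print i
# INSERT INTO users (user_name) VALUES (?)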

Documentation on inserting: Inserts.

Selecting

Let's check that the data we have put into the users table is actually there. The procedure is analogous to the insert example above, except you now call the select() method off the users table:

>>> s = users_table.select()
>>> print s
SELECT users.user_id, users.user_name, users.password 
FROM users
>>> r = s.execute()
SELECT users.user_id, users.user_name, users.password 
FROM users
[]

This time, we won't ignore the return value of execute(). It's an instance of ResultProxy, which is a result-holding object that behaves very similarly to the cursor object one deals with when using a database API directly:

>>> r 
<sqlalchemy.engine.base.ResultProxy instance at 0x...>
>>> r.fetchone()
(1, u'Mary', u'secure')
>>> r.fetchall()
[(2, u'Tom', None), (3, u'Fred', None), (4, u'Harry', None)]

Query criteria for the select are specified using Python expressions, using the Column objects in the Table as a base. All expressions constructed from Column objects are themselves instances of ClauseElement, just like the Select, Insert, and Table objects themselves.

>>> r = users_table.select(users_table.c.user_name=='Harry').execute()
SELECT users.user_id, users.user_name, users.password 
FROM users 
WHERE users.user_name = ?
['Harry']
>>> row = r.fetchone()
>>> print row
(4, u'Harry', None)

Pretty much the full range of standard SQL operations is supported as constructed Python expressions, including joins, ordering, grouping, functions, correlated subqueries, unions, etc. Documentation on selecting: Simple Select.
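As a small taste of that range, here is a hedged sketch combining a compound WHERE criterion with ordering, using only constructs already shown in this tutorial (the order_by keyword also appears in the echoed queries later on):

# a sketch: combining criteria with and_() and ordering the results
s = users_table.select(
    and_(users_table.c.user_id > 1,
         users_table.c.user_name != None),
    order_by=[users_table.c.user_name])
print s
# SELECT users.user_id, users.user_name, users.password
# FROM users
# WHERE users.user_id > ? AND users.user_name IS NOT NULL
# ORDER BY users.user_name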

Working with Rows

You can see that when we print out the rows returned by an execution result, it prints the rows as tuples. These rows in fact support both the list and dictionary interfaces. The dictionary interface allows the addressing of columns by string column name, or even the original Column object:

>>> row.keys()
['user_id', 'user_name', 'password']
>>> row['user_id'], row[1], row[users_table.c.password] 
(4, u'Harry', None)

Addressing the columns in a row based on the original Column object is especially handy, as it eliminates the need to work with literal column names altogether.

Result sets also support iteration. We'll show this with a slightly different form of select that allows you to specify the specific columns to be selected:

>>> for row in select([users_table.c.user_id, users_table.c.user_name]).execute(): 
...     print row
SELECT users.user_id, users.user_name
FROM users
[]
(1, u'Mary')
(2, u'Tom')
(3, u'Fred')
(4, u'Harry')
Table Relationships

Let's create a second table, email_addresses, which references the users table. To define the relationship between the two tables, we will use the ForeignKey construct. We will also issue the CREATE statement for the table in one step:

>>> email_addresses_table = Table('email_addresses', metadata,
...     Column('address_id', Integer, primary_key=True),
...     Column('email_address', String(100), nullable=False),
...     Column('user_id', Integer, ForeignKey('users.user_id'))).create() 
CREATE TABLE email_addresses(
    address_id INTEGER NOT NULL PRIMARY KEY,
    email_address VARCHAR(100) NOT NULL,
    user_id INTEGER REFERENCES users(user_id)
)
...

Above, the email_addresses table is related to the users table via the ForeignKey('users.user_id'). The ForeignKey constructor can take a Column object or a string representing the table and column name. When using the string argument, the referenced table must exist within the same MetaData object; that's where it looks for the other table!
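For illustration, here is the equivalent Column-object form of the same foreign key (a sketch; it assumes users_table is in scope, as it is in this tutorial):

# a sketch: the same foreign key, specified via the Column object
# rather than a string; no lookup in the MetaData is needed
Column('user_id', Integer, ForeignKey(users_table.c.user_id))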

Next, let's put a few rows in:

>>> email_addresses_table.insert().execute(
...     {'email_address':'tom@tom.com', 'user_id':2},
...     {'email_address':'mary@mary.com', 'user_id':1}) 
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
[['tom@tom.com', 2], ['mary@mary.com', 1]]
COMMIT
<sqlalchemy.engine.base.ResultProxy instance at 0x...>

With two related tables, we can now construct a join amongst them using the join method:

>>> r = users_table.join(email_addresses_table).select().execute()
SELECT users.user_id, users.user_name, users.password, email_addresses.address_id, email_addresses.email_address, email_addresses.user_id 
FROM users JOIN email_addresses ON users.user_id = email_addresses.user_id
[]
>>> print [row for row in r]
[(1, u'Mary', u'secure', 2, u'mary@mary.com', 1), (2, u'Tom', None, 1, u'tom@tom.com', 2)]

The join method is also a standalone function in the sqlalchemy namespace. The join condition is figured out from the foreign keys of the Table objects given. The condition (also called the "ON clause") can be specified explicitly, such as in this example where we locate all users that used their email address as their password:

>>> print join(users_table, email_addresses_table, 
...     and_(users_table.c.user_id==email_addresses_table.c.user_id, 
...     users_table.c.password==email_addresses_table.c.email_address)
...     )
users JOIN email_addresses ON users.user_id = email_addresses.user_id AND users.password = email_addresses.email_address
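An outer join is constructed the same way; a minimal sketch, assuming the standalone outerjoin() function mirrors join() as described above:

# a sketch: LEFT OUTER JOIN via the standalone outerjoin() function
print outerjoin(users_table, email_addresses_table).select()
# SELECT users.user_id, users.user_name, users.password, ...
# FROM users LEFT OUTER JOIN email_addresses
# ON users.user_id = email_addresses.user_id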
Working with Object Mappers

Now that we have a little bit of Table and SQL operations covered, let's look into SQLAlchemy's ORM (object relational mapper). With the ORM, you associate Tables (and other Selectable units, like queries and table aliases) with Python classes, into units called Mappers. Then you can execute queries that return lists of object instances, instead of result sets. The object instances themselves are associated with an object called a Session, which automatically tracks changes on each object and supports a "save all at once" operation called a flush.

Creating a Mapper

A Mapper is usually created once per Python class, and at its core it says: "objects of this class are to be stored as rows in this table". Let's create a class called User, which will represent a user object that is stored in our users table:

>>> class User(object):
...     def __repr__(self):
...         return "(User %s,password:%s)" % (self.user_name, self.password)

The class is a new-style class (i.e., it extends object) and does not require a constructor (although one may be provided if desired). We just have one __repr__ method on it which will display basic information about the User. Note that the __repr__ method references the instance variables user_name and password, which otherwise aren't defined. While we are free to explicitly define these attributes and treat them normally, this is optional; SQLAlchemy's Mapper construct will manage them for us, since their names correspond to the names of columns in the users table. Let's create a mapper, and observe that these attributes are now defined:

>>> usermapper = mapper(User, users_table)
>>> u1 = User()
>>> print u1.user_name
None
>>> print u1.password
None

The mapper function returns a new instance of Mapper. As it is the first Mapper we have created for the User class, it is known as the class's primary mapper. We generally don't need to hold onto the usermapper instance variable; SA's ORM can automatically locate this Mapper when it deals with the class, or instances of that class.

Obtaining a Session

After you create a Mapper, all operations with that Mapper require the usage of an important object called a Session. All objects loaded or saved by the Mapper must be attached to a Session object, which represents a kind of "workspace" of objects that are loaded into memory. A particular object instance can only be attached to one Session at a time.

By default, you have to create a Session object explicitly before you can load or save objects. There are several ways to manage sessions, but the most straightforward is to just create one, which we will do by calling create_session():

>>> session = create_session()
>>> session 
<sqlalchemy.orm.session.Session object at 0x...>
The Query Object

The Session has all kinds of methods on it to retrieve and store objects, and also to view their current status. The Session also provides an easy interface which can be used to query the database, by giving you an instance of a Query object corresponding to a particular Python class:

>>> query = session.query(User)
>>> print query.select_by(user_name='Harry')
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name = ? ORDER BY users.oid
['Harry']
[(User Harry,password:None)]

All querying for objects is performed via an instance of Query. The various select methods on an instance of Mapper also use an underlying Query object to perform the operation. A Query is always bound to a specific Session.

Let's turn off the database echoing for a moment, and try out a few methods on Query. Methods that end with the suffix _by primarily take keyword arguments which correspond to properties on the object. Other methods take ClauseElement objects, which are constructed by using Column objects inside of Python expressions, in the same way as we did with our SQL select example in the previous section of this tutorial. Using ClauseElement structures to query objects is more verbose but more flexible:

>>> metadata.engine.echo = False
>>> print query.select(User.c.user_id==3)
[(User Fred,password:None)]
>>> print query.get(2)
(User Tom,password:None)
>>> print query.get_by(user_name='Mary')
(User Mary,password:secure)
>>> print query.selectfirst(User.c.password==None)
(User Tom,password:None)
>>> print query.count()
4

Notice that our User class has a special attribute c attached to it. This 'c' represents the columns on the User mapper's Table object. Saying User.c.user_name is synonymous with saying users_table.c.user_name, recalling that User is the Python class and users_table is our Table object.
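As a quick sanity check (a sketch; Column objects print as table.column, as seen in the echoed SQL above):

>>> print User.c.user_name
users.user_name
>>> print users_table.c.user_name
users.user_name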

Making Changes

With a little experience in loading objects, let's see what it's like to make changes. First, let's create a new user, "Ed". We do this by just constructing the new object. Then, we just add it to the session:

>>> ed = User()
>>> ed.user_name = 'Ed'
>>> ed.password = 'edspassword'
>>> session.save(ed)
>>> ed in session
True

Let's also make a few changes on some of the objects in the database. We will load them with our Query object, and then change some things.

>>> mary = query.get_by(user_name='Mary')
>>> harry = query.get_by(user_name='Harry')
>>> mary.password = 'marysnewpassword'
>>> harry.password = 'harrysnewpassword'

At the moment, nothing has been saved to the database; all of our changes are in memory only. What happens if some other part of the application also tries to load 'Mary' from the database and make some changes before we have had a chance to save them? Assuming that the same Session is used, loading 'Mary' from the database a second time will issue a second query in order to locate the primary key of 'Mary', but will return the same object instance as the one already loaded. This behavior is due to an important property of the Session known as the identity map:

>>> mary2 = query.get_by(user_name='Mary')
>>> mary is mary2
True

With the identity map, a single Session can be relied upon to keep all loaded instances straight.

As far as the issue of the same object being modified in two different Sessions, that's an issue of concurrency detection; SQLAlchemy does some basic concurrency checks when saving objects, with the option for a stronger check using version ids.
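As an illustration only, here is a hedged sketch of the stronger check; it assumes the mapper's version_id_col option and a hypothetical table with an integer version column (neither is part of this tutorial's schema):

# hypothetical: a table carrying an integer "version" column
versioned_table = Table('versioned_things', metadata,
    Column('id', Integer, primary_key=True),
    Column('version', Integer),
    Column('value', String(100)))

class Thing(object):
    pass

# the mapper includes "version" in each UPDATE's WHERE clause and
# increments it, raising an error if the row changed underneath us
mapper(Thing, versioned_table, version_id_col=versioned_table.c.version)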

Saving

With a new user "ed" and some changes made on "Mary" and "Harry", let's also mark "Fred" as deleted:

>>> fred = query.get_by(user_name='Fred')
>>> session.delete(fred)

Then, to send all of our changes to the database, we flush() the Session. Let's turn echo back on to see this happen:

>>> metadata.engine.echo = True
>>> session.flush()
BEGIN
UPDATE users SET password=? WHERE users.user_id = ?
['marysnewpassword', 1]
UPDATE users SET password=? WHERE users.user_id = ?
['harrysnewpassword', 4]
INSERT INTO users (user_name, password) VALUES (?, ?)
['Ed', 'edspassword']
DELETE FROM users WHERE users.user_id = ?
[3]
COMMIT
Relationships

When our User object contains relationships to other kinds of information, such as a list of email addresses, we can indicate this by using a function called relation() when creating the Mapper. While there is a lot you can do with relations, we'll cover a simple one here. First, recall that our users table has a foreign key relationship to another table called email_addresses. A single row in email_addresses has a column user_id that references a row in the users table; since many rows in the email_addresses table can reference a single row in users, this is called a one to many relationship.

First, deal with the email_addresses table by itself. We will create a new class Address which represents a single row in the email_addresses table, and a corresponding Mapper which will associate the Address class with the email_addresses table:

>>> class Address(object):
...     def __init__(self, email_address):
...         self.email_address = email_address
...     def __repr__(self):
...         return "(Address %s)" % (self.email_address)
>>> mapper(Address, email_addresses_table) 
<sqlalchemy.orm.mapper.Mapper object at 0x...>

Next, we associate the User and Address classes together by creating a relation using relation(), and then adding that relation to the User mapper, using the add_property function:

>>> usermapper.add_property('addresses', relation(Address))

The relation() function takes either a class or a Mapper as its first argument, and has many options to further control its behavior. The User mapper has now placed an additional property on each User instance called addresses. SQLAlchemy will automatically determine that this relationship is a one-to-many relationship, and will subsequently create addresses as a list. When a new User is created, this list will begin as empty.
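More commonly, relations are declared up front in the mapper() call itself; a minimal sketch using the properties dictionary, shown as an alternative to the add_property() call above (recall a class has just one primary mapper, so you would use one form or the other):

# a sketch: declaring the relation when the mapper is first created
mapper(User, users_table, properties={
    'addresses': relation(Address)
})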

Let's see what we get for the email addresses already in the database. Since we have made a change to the mapper's configuration, it's best that we clear out our Session, which is currently holding onto every User object we have already loaded:

>>> session.clear()

We can then treat the addresses attribute on each User object like a regular list:

>>> mary = query.get_by(user_name='Mary') 
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name = ? ORDER BY users.oid 
LIMIT 1 OFFSET 0
['Mary']
>>> print [a for a in mary.addresses]
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address 
FROM email_addresses 
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[1]
[(Address mary@mary.com)]

Adding to the list is just as easy. New Address objects will be detected and saved when we flush the Session:

>>> mary.addresses.append(Address('mary2@gmail.com'))
>>> session.flush() 
BEGIN
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['mary2@gmail.com', 1]
COMMIT

Main documentation for using mappers: Data Mapping

Transactions

You may have noticed from the example above that when we say session.flush(), SQLAlchemy echoes BEGIN and COMMIT to indicate a transaction with the database. The flush() method, since it may execute many statements in a row, will automatically use a transaction in order to execute these instructions. But what if we want to use flush() inside of a larger transaction? This is performed via the SessionTransaction object, which we can establish using session.create_transaction(). Below, we will perform a more complicated SELECT statement, make several changes to our collection of users and email addresses, and then create a new user with two email addresses, within the context of a transaction. We will perform a flush() in the middle of it to write the changes we have so far, and then allow the remaining changes to be written when we finally commit() the transaction. We enclose our operations within a try/except block to ensure that resources are properly freed:

>>> transaction = session.create_transaction()
>>> try: 
...     (ed, harry, mary) = session.query(User).select(
...         User.c.user_name.in_('Ed', 'Harry', 'Mary'), order_by=User.c.user_name
...     )
...     del mary.addresses[1]
...     harry.addresses.append(Address('harry2@gmail.com'))
...     session.flush()
...     print "***flushed the session***"
...     fred = User()
...     fred.user_name = 'fred_again'
...     fred.addresses.append(Address('fred@fred.com'))
...     fred.addresses.append(Address('fredsnewemail@fred.com'))
...     session.save(fred)
...     transaction.commit()
... except:
...     transaction.rollback()
...     raise
BEGIN
SELECT users.user_name AS users_user_name, users.password AS users_password, users.user_id AS users_user_id 
FROM users 
WHERE users.user_name IN (?, ?, ?) ORDER BY users.user_name
['Ed', 'Harry', 'Mary']
SELECT email_addresses.user_id AS email_addresses_user_id, email_addresses.address_id AS email_addresses_address_id, email_addresses.email_address AS email_addresses_email_address 
FROM email_addresses 
WHERE ? = email_addresses.user_id ORDER BY email_addresses.oid
[4]
UPDATE email_addresses SET user_id=? WHERE email_addresses.address_id = ?
[None, 3]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['harry2@gmail.com', 4]
***flushed the session***    
INSERT INTO users (user_name, password) VALUES (?, ?)
['fred_again', None]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fred@fred.com', 6]
INSERT INTO email_addresses (email_address, user_id) VALUES (?, ?)
['fredsnewemail@fred.com', 6]
COMMIT

Main documentation: Session / Unit of Work

Next Steps

That covers a quick tour through the basic idea of SQLAlchemy, in its simplest form. Beyond that, one should familiarize oneself with the basics of Sessions, the various patterns that can be used to define different kinds of Mappers and relations among them, the rudimentary SQL types that are available when constructing Tables, and the basics of Engines, SQL statements, and database Connections.

Database Engines

A database engine is a subclass of sqlalchemy.sql.Engine, and is the starting point for where SQLAlchemy provides a layer of abstraction on top of the various DBAPI2 database modules. For all databases supported by SA, there is a specific "implementation" module, found in the sqlalchemy.databases package, that provides all the objects an Engine needs in order to perform its job. A typical user of SQLAlchemy never needs to deal with these modules directly. For many purposes, the only knowledge that's needed is how to create an Engine for a particular connection URL. When dealing with direct execution of SQL statements, one would also be aware of Result, Connection, and Transaction objects. The primary public-facing objects are:

  • URL - represents the identifier for a particular database. URL objects are usually created automatically based on a given connect string passed to the create_engine() function.
  • Engine - Combines a connection-providing resource with implementation-provided objects that know how to generate, execute, and gather information about SQL statements. It also provides the primary interface by which Connections are obtained, as well as a context for constructed SQL objects and schema constructs to "implicitly execute" themselves, which is an optional feature of SA 0.2. The Engine object that is normally dealt with is an instance of sqlalchemy.engine.base.ComposedSQLEngine.
  • Connection - represents a connection to the database. The underlying connection object returned by a DBAPI's connect() method is referenced internally by the Connection object. Connection provides methods that handle the execution of SQLAlchemy's own SQL constructs, as well as literal string-based statements.
  • Transaction - represents a transaction on a single Connection. Includes begin(), commit() and rollback() methods that support basic "nestable" behavior, meaning an outermost transaction is maintained against multiple nested calls to begin/commit.
  • ResultProxy - Represents the results of an execution, and is most analogous to the cursor object in DBAPI. It primarily allows iteration over result sets, but also provides an interface to information about inserts/updates/deletes, such as the count of rows affected, last inserted IDs, etc.
  • RowProxy - Represents a single row returned by the fetchone() method on ResultProxy.

Underneath the public-facing API of ComposedSQLEngine, several components are provided by database implementations to provide the full behavior, including:

  • Dialect - this object is provided by database implementations to describe the behavior of a particular database. It acts as a repository for metadata about a database's characteristics, and provides factory methods for other objects that deal with generating SQL strings and objects that handle some of the details of statement execution.
  • ConnectionProvider - this object knows how to return a DBAPI connection object. It typically talks to a connection pool which maintains one or more connections in memory for quick re-use.
  • ExecutionContext - this object is created for each execution of a single SQL statement, and tracks information about its execution such as primary keys inserted, the total count of rows affected, etc. It also may implement any special logic that various DBAPI implementations may require before or after a statement execution.
  • Compiler - receives SQL expression objects and assembles them into strings that are suitable for direct execution, as well as collecting bind parameters into a dictionary or list to be sent along with the statement.
  • SchemaGenerator - receives collections of Schema objects and knows how to generate the appropriate SQL for CREATE and DROP statements.
Supported Databases

Engines exist for SQLite, Postgres, MySQL, and Oracle, using the Pysqlite, Psycopg2 (Psycopg1 will work to some degree but its typing model is not supported...install Psycopg2!), MySQLdb, and cx_Oracle modules. There is also preliminary support for MS-SQL using adodbapi or pymssql, as well as Firebird. For each engine, a distinct Python module exists in the sqlalchemy.databases package, which provides implementations of some of the objects mentioned in the previous section.

Establishing a Database Engine

SQLAlchemy 0.2 indicates the source of an Engine strictly via RFC-1738 style URLs, combined with optional keyword arguments to specify options for the Engine. The form of the URL is:

driver://username:password@host:port/database

Available drivernames are sqlite, mysql, postgres, oracle, mssql, and firebird. For sqlite, the database name is the filename to connect to, or the special name ":memory:" which indicates an in-memory database. The URL is typically sent as a string to the create_engine() function:

pg_db = create_engine('postgres://scott:tiger@localhost:5432/mydatabase')
sqlite_db = create_engine('sqlite:///mydb.txt')
mysql_db = create_engine('mysql://localhost/foo')

# oracle via TNS name
oracle_db = create_engine('oracle://scott:tiger@dsn')

# oracle will feed host/port/SID into cx_oracle.makedsn
oracle_db = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')

The Engine will create its first connection to the database when a SQL statement is executed. As concurrent statements are executed, the underlying connection pool will grow to a default size of five connections, and will allow a default "overflow" of ten. Since the Engine is essentially "home base" for the connection pool, it follows that you should keep a single Engine per database established within an application, rather than creating a new one for each connection.
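In practice, that usually means creating the Engine once at module level and importing it wherever it's needed; a minimal sketch (the URL, module name, and table are assumptions):

# myapp/database.py -- one Engine for the whole application
from sqlalchemy import *

db = create_engine('postgres://scott:tiger@localhost:5432/mydatabase')

def get_user_count():
    # implicit execution checks a connection out of db's pool and
    # returns it when the result is closed or garbage collected
    return db.execute("select count(*) from users").fetchone()[0]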

Database Engine Options

Keyword options can also be specified to create_engine(), following the string URL as follows:

db = create_engine('postgres://...', encoding='latin1', echo=True, module=psycopg1)

Options that can be specified include the following; a combined example appears after the list:

  • strategy='plain' : the Strategy describes the general configuration used to create this Engine. The two available values are plain, which is the default, and threadlocal, which applies a "thread-local context" to implicit executions performed by the Engine. This context is further described in Implicit Connection Contexts.
  • pool=None : an instance of sqlalchemy.pool.Pool to be used as the underlying source for connections, overriding the engine's connect arguments (pooling is described in Connection Pooling). If None, a default Pool (usually QueuePool, or SingletonThreadPool in the case of SQLite) will be created using the engine's connect arguments.

Example:

from sqlalchemy import *
import sqlalchemy.pool as pool
import MySQLdb

def getconn():
    return MySQLdb.connect(user='ed', dbname='mydb')

engine = create_engine('mysql', pool=pool.QueuePool(getconn, pool_size=20, max_overflow=40))
  • pool_size=5 : the number of connections to keep open inside the connection pool. This is only used with QueuePool.
  • max_overflow=10 : the number of connections to allow in "overflow", that is, connections that can be opened above and beyond the pool_size setting. This is only used with QueuePool.
  • pool_timeout=30 : number of seconds to wait before giving up on getting a connection from the pool. This is only used with QueuePool.
  • echo=False : if True, the Engine will log all statements as well as a repr() of their parameter lists to the engine's logger, which defaults to sys.stdout. The echo attribute of ComposedSQLEngine can be modified at any time to turn logging on and off. If set to the string "debug", result rows will be printed to the standard output as well.
  • logger=None : a file-like object where logging output can be sent, if echo is set to True. Newlines will not be sent with log messages. This defaults to an internal logging object which references sys.stdout.
  • module=None : used by database implementations which support multiple DBAPI modules, this is a reference to a DBAPI2 module to be used instead of the engine's default module. For Postgres, the default is psycopg2, or psycopg1 if 2 cannot be found. For Oracle, its cx_Oracle.
  • use_ansi=True : used only by Oracle; when False, the Oracle driver attempts to support a particular "quirk" of Oracle versions 8 and previous, that the LEFT OUTER JOIN SQL syntax is not supported, and the "Oracle join" syntax of using <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN.
  • threaded=True : used by cx_Oracle; sets the threaded parameter of the connection indicating thread-safe usage. cx_Oracle docs indicate setting this flag to False will speed performance by 10-15%. While this defaults to False in cx_Oracle, SQLAlchemy defaults it to True, preferring stability over early optimization.
  • use_oids=False : used only by Postgres, will enable the column name "oid" as the object ID column, which is also used for the default sort order of tables. Postgres as of 8.1 has object IDs disabled by default.
  • convert_unicode=False : if set to True, all String/character based types will convert Unicode values to raw byte values going into the database, and all raw byte values to Python Unicode coming out in result sets. This is an engine-wide method to provide unicode across the board. For unicode conversion on a column-by-column level, use the Unicode column type instead.
  • encoding='utf-8' : the encoding to use for all Unicode translations, both by engine-wide unicode conversion as well as the Unicode type object.
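Putting a few of these together, a sketch (the URL is an assumption):

db = create_engine('postgres://scott:tiger@localhost/test',
    pool_size=10,           # keep up to ten connections in the pool
    max_overflow=20,        # allow up to twenty more in "overflow"
    convert_unicode=True,   # translate unicode going in and out
    encoding='latin1',      # using this encoding
    echo='debug')           # log statements and result rows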
Using Connections

In this section we describe the SQL execution interface available from an Engine instance. Note that when using the Object Relational Mapper (ORM), as well as when dealing with "bound" metadata objects (described later), SQLAlchemy deals with the Engine for you and you generally don't need to know much about it; in those cases, you can skip this section and go to Database Meta Data.

The Engine provides a connect() method which returns a Connection object. This object provides methods by which literal SQL text as well as SQL clause constructs can be compiled and executed.

engine = create_engine('sqlite:///:memory:')
connection = engine.connect()
result = connection.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
connection.close()

The close method on Connection does not actually remove the underlying connection to the database, but rather indicates that the underlying resources can be returned to the connection pool. When using the connect() method, the DBAPI connection referenced by the Connection object is not referenced anywhere else.

In both execution styles above, the Connection object will also automatically return its resources to the connection pool when the object is garbage collected, i.e. when its __del__() method is called. When using the standard C implementation of Python, this method is usually called as soon as the object is dereferenced. With other Python implementations such as Jython, this is not guaranteed.
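For code that must not depend on garbage collection behavior, a try/finally block makes the return of the connection explicit; a minimal sketch:

# a sketch: return the connection to the pool regardless of errors
connection = engine.connect()
try:
    result = connection.execute("select * from mytable")
    for row in result:
        print row
finally:
    connection.close()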

The execute method on Engine and Connection can also receive SQL clause constructs as well, which are described in Constructing SQL Queries via Python Expressions:

connection = engine.connect()
result = connection.execute(select([table1], table1.c.col1==5))
for row in result:
    print row['col1'], row['col2']
connection.close()

Both Connection and Engine fulfill an interface known as Connectable which specifies common functionality between the two objects, such as getting a Connection and executing queries. Therefore, most SQLAlchemy functions which take an Engine as a parameter with which to execute SQL will also accept a Connection:

Specify Engine or Connection
engine = create_engine('sqlite:///:memory:')

# specify some Table metadata
metadata = MetaData()
table = Table('sometable', metadata, Column('col1', Integer))

# create the table with the Engine
table.create(engine=engine)

# drop the table with a Connection off the Engine
connection = engine.connect()
table.drop(engine=connection)
Implicit Connection Contexts

An implicit connection refers to connections that are allocated by the Engine internally. There are two general cases when this occurs: when using the various execute() methods that are available off the Engine object itself, and when calling the execute() method on constructed SQL objects, which are described in Constructing SQL Queries via Python Expressions.

Implicit Connection
engine = create_engine('sqlite:///:memory:')
result = engine.execute("select * from mytable where col1=:col1", col1=5)
for row in result:
    print row['col1'], row['col2']
result.close()

When using implicit connections, the returned ResultProxy has a close() method which will return the resources used by the underlying Connection.

The strategy keyword argument to create_engine() affects the algorithm used to retrieve the underlying DBAPI connection used by implicit executions. When set to plain, each implicit execution requests a unique connection from the connection pool, which is returned to the pool when the resulting ResultProxy falls out of scope (i.e. __del__() is called) or its close() method is called. If a second implicit execution occurs while the ResultProxy from the previous execution is still open, then a second connection is pulled from the pool.

When strategy is set to threadlocal, the Engine still checks out a connection which is closeable in the same manner via the ResultProxy, except the connection it checks out will be the same connection as one which is already checked out, assuming the operation is in the same thread. When all ResultProxy objects are closed, the connection is returned to the pool normally.

It is crucial to note that the plain and threadlocal contexts do not impact the connect() method on the Engine. connect() always returns a unique connection. Implicit connections use a different method off of Engine for their operations called contextual_connect().

The plain strategy is better suited to an application that insures the explicit releasing of the resources used by each execution. This is because each execution uses its own distinct connection resource, and as those resources remain open, multiple connections can be checked out from the pool quickly. Since the connection pool will block further requests when too many connections have been checked out, not keeping track of this can impact an application's stability.

Plain Strategy
db = create_engine('mysql://localhost/test', strategy='plain')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references a *second* DBAPI connection resource.
r2 = db.execute("select * from table2")
for row in r1:
    ...
for row in r2:
    ...
# release connection 1
r1.close()

# release connection 2
r2.close()

Advantages to plain include that connection resources are immediately returned to the connection pool, without any reliance upon the __del__() method; there is no chance of resources being left around by a Python implementation that doesn't necessarily call __del__() immediately.

The threadlocal strategy is better suited to a programming style which relies upon the __del__() method of Connection objects in order to return them to the connection pool, rather than explicitly issuing a close() statement upon the ResultProxy object. This is because all of the executions within a single thread will share the same connection, if one has already been checked out in the current thread. Using this style, an application will use only one connection per thread at most within the scope of all implicit executions.

Threadlocal Strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')

# execute one statement and receive results.  r1 now references a DBAPI connection resource.
r1 = db.execute("select * from table1")

# execute a second statement and receive results.  r2 now references the *same* resource as r1
r2 = db.execute("select * from table2")

for row in r1:
    ...
for row in r2:
    ...
# dereference r1.  the connection is still held by r2.
r1 = None

# dereference r2.  with no more references to the underlying connection resources, they
# are returned to the pool.
r2 = None

Advantages to threadlocal include that resources can be left to clean up after themselves, application code can be more minimal, it's guaranteed that only one connection is used per thread, and there is no chance of a "connection pool block", which is when an execution hangs because the current thread has already checked out all remaining resources.

To get at the actual Connection object which is used by implicit executions, call the contextual_connect() method on Engine:

Contextual Connection
# threadlocal strategy
db = create_engine('mysql://localhost/test', strategy='threadlocal')

conn1 = db.contextual_connect()
conn2 = db.contextual_connect()

# in the same thread, both contextual connections share the same
# underlying DBAPI connection
assert conn1.connection is conn2.connection

When the plain strategy is used, the contextual_connect() method is synonymous with the connect() method; both return a distinct connection from the pool.

Transactions

The Connection object provides a begin() method which returns a Transaction object. This object is usually used within a try/except clause so that it is guaranteed to rollback() or commit():

trans = connection.begin()
try:
    r1 = connection.execute(table1.select())
    connection.execute(table1.insert(), col1=7, col2='this is some data')
    trans.commit()
except:
    trans.rollback()
    raise

The Transaction object also handles "nested" behavior by keeping track of the outermost begin/commit pair. In this example, two functions both issue a transaction on a Connection, but only the outermost Transaction object actually takes effect when it is committed.

# method_a starts a transaction and calls method_b
def method_a(connection):
    trans = connection.begin() # open a transaction
    try:
        method_b(connection)
        trans.commit()  # transaction is committed here
    except:
        trans.rollback() # this rolls back the transaction unconditionally
        raise

# method_b also starts a transaction
def method_b(connection):
    trans = connection.begin() # open a transaction - this runs in the context of method_a's transaction
    try:
        connection.execute("insert into mytable values ('bat', 'lala')")
        connection.execute(mytable.insert(), col1='bat', col2='lala')
        trans.commit()  # transaction is not committed yet
    except:
        trans.rollback() # this rolls back the transaction unconditionally
        raise

# open a Connection and call method_a
conn = engine.connect()                
method_a(conn)
conn.close()

Above, method_a is called first, which calls connection.begin(). Then it calls method_b. When method_b calls connection.begin(), it just increments a counter that is decremented when it calls commit(). If either method_a or method_b calls rollback(), the whole transaction is rolled back. The transaction is not committed until method_a calls the commit() method.

Note that SQLAlchemy's Object Relational Mapper also provides a way to control transaction scope at a higher level; this is described in SessionTransaction.


Describing Databases with MetaData

The core of SQLAlchemy's query and object mapping operations is database metadata: Python objects that describe tables and other schema-level objects. Metadata objects can be created by explicitly naming the various components and their properties, using the Table, Column, ForeignKey, Index, and Sequence objects imported from sqlalchemy.schema. There is also support for reflection, which means you specify only the names of the entities and they are loaded from the database automatically.

A collection of metadata entities is stored in an object aptly named MetaData. This object takes an optional name parameter:

from sqlalchemy import *

metadata = MetaData(name='my metadata')

Then to construct a Table, use the Table class:

users = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('email_address', String(60), key='email'),
    Column('password', String(20), nullable = False)
)

user_prefs = Table('user_prefs', metadata, 
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
    Column('pref_name', String(40), nullable=False),
    Column('pref_value', String(100))
)

The specific datatypes for each Column, such as Integer, String, etc. are described in The Types System, and exist within the module sqlalchemy.types as well as the global sqlalchemy namespace.

The MetaData object supports some handy methods, such as getting a list of Tables in the order (or reverse) of their dependency:

>>> for t in metadata.table_iterator(reverse=False):
...    print t.name
users
user_prefs

And Table provides an interface to the table's properties as well as that of its columns:

employees = Table('employees', metadata, 
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)

# access the column "employee_id":
employees.columns.employee_id

# or just
employees.c.employee_id

# via string
employees.c['employee_id']

# iterate through all columns
for c in employees.c:
    print c.name

# get the table's primary key columns
for primary_key in employees.primary_key:
    print primary_key.name

# get the table's foreign key objects:
for fkey in employees.foreign_keys:
    print fkey

# access the table's MetaData:
employees.metadata

# access the table's Engine, if its MetaData is bound:
employees.engine

# access a column's name, type, nullable, primary key, foreign key
employees.c.employee_id.name
employees.c.employee_id.type
employees.c.employee_id.nullable
employees.c.employee_id.primary_key
employees.c.employee_dept.foreign_key

# get the "key" of a column, which defaults to its name, but can 
# be any user-defined string:
employees.c.name.key

# access a column's table:
assert employees.c.employee_id.table is employees

# get the table related by a foreign key
ftable = employees.c.employee_dept.foreign_key.column.table
Binding MetaData to an Engine

A MetaData object can be associated with one or more Engine instances. This allows the MetaData and the elements within it to perform operations automatically, using the connection resources of that Engine. This includes being able to "reflect" the columns of tables, as well as to perform create and drop operations without needing to pass an Engine or Connection around. It also allows SQL constructs to be created which know how to execute themselves (called "implicit execution").

To bind MetaData to a single Engine, use BoundMetaData:

engine = create_engine('sqlite://', **kwargs)

# create BoundMetaData from an Engine
meta = BoundMetaData(engine)

# create the Engine and MetaData in one step
meta = BoundMetaData('postgres://db/', **kwargs)

Another form of MetaData exists which allows connecting to any number of engines, within the context of the current thread. This is DynamicMetaData:

meta = DynamicMetaData()

meta.connect(engine)    # connect to an existing Engine

meta.connect('mysql://user@host/dsn')   # create a new Engine and connect

DynamicMetaData is ideal for applications that need to use the same set of Tables for many different database connections in the same process, such as a CherryPy web application which handles multiple application instances in one process.
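A minimal sketch of that pattern (the URLs are assumptions); each call to connect() binds the same Table objects to a different Engine for the current thread:

meta = DynamicMetaData()

users = Table('users', meta,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(40)))

# thread A serves one application instance...
meta.connect('sqlite:///instance_a.db')
users.create()

# ...while thread B, issuing its own connect() call, works against
# a different database using the very same Table objects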

Using the global Metadata object

Some users prefer to create Table objects without specifying a MetaData object, having Tables scoped on an application-wide basis. For them the default_metadata object and the global_connect() function are supplied. default_metadata is simply an instance of DynamicMetaData that exists within the sqlalchemy namespace, and global_connect() is a synonym for default_metadata.connect(). Defining a Table with no MetaData argument will automatically use this default metadata, as follows:

from sqlalchemy import *

# a Table with just a name and its Columns
mytable = Table('mytable', 
    Column('col1', Integer, primary_key=True),
    Column('col2', String(40))
    )

# connect all the "anonymous" tables to a postgres uri in the current thread    
global_connect('postgres://foo:bar@lala/test')

# create all tables in the default metadata
default_metadata.create_all()

# the table is bound
mytable.insert().execute(col1=5, col2='some value')
Reflecting Tables

Once you have a BoundMetaData or a connected DynamicMetaData, you can create Table objects without specifying their columns, just their names, using autoload=True:

>>> messages = Table('messages', meta, autoload = True)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']

At the moment the Table is constructed, it will query the database for the columns and constraints of the messages table.

Note that if a reflected table has a foreign key referencing another table, then the metadata for the related table will be loaded as well, even if it has not been defined by the application:

>>> shopping_cart_items = Table('shopping_cart_items', meta, autoload = True)
>>> print shopping_cart_items.c.cart_id.table.name
shopping_carts

To get direct access to 'shopping_carts', simply instantiate it via the Table constructor. You'll get the same instance of the shopping cart Table as the one that is attached to shopping_cart_items:

>>> shopping_carts = Table('shopping_carts', meta)
>>> shopping_carts is shopping_cart_items.c.cart_id.table
True

This works because when the Table constructor is called for a particular name and MetaData object, if the table has already been created then the instance returned will be the same as the original. This is a singleton constructor:

>>> news_articles = Table('news', meta, 
... Column('article_id', Integer, primary_key = True),
... Column('url', String(250), nullable = False)
... )
>>> othertable = Table('news', meta)
>>> othertable is news_articles
True
Specifying the Schema Name

Some databases support the concept of multiple schemas. A Table can reference this by specifying the schema keyword argument:

financial_info = Table('financial_info', meta,
    Column('id', Integer, primary_key=True),
    Column('value', String(100), nullable=False),
    schema='remote_banks'
)

Within the MetaData collection, this table will be identified by the combination of financial_info and remote_banks. If another table called financial_info is referenced without the remote_banks schema, it will refer to a different Table. ForeignKey objects can reference columns in this table using the form remote_banks.financial_info.id.
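A minimal sketch of such a reference (the referencing table here is hypothetical):

# a sketch: a foreign key referencing the schema-qualified table
account_links = Table('account_links', meta,
    Column('id', Integer, primary_key=True),
    Column('financial_info_id', Integer,
        ForeignKey('remote_banks.financial_info.id'))
)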

Other Options

Tables may support database-specific options, such as MySQL's engine option that can specify "MyISAM", "InnoDB", and other backends for the table:

addresses = Table('engine_email_addresses', meta,
    Column('address_id', Integer, primary_key = True),
    Column('remote_user_id', Integer, ForeignKey(users.c.user_id)),
    Column('email_address', String(20)),
    mysql_engine='InnoDB'
)
Creating and Dropping Database Tables

Creating and dropping individual tables can be done via the create() and drop() methods of Table; these methods take an optional engine parameter which references an Engine or a Connection. If not supplied, the Engine bound to the MetaData will be used; if the MetaData is not bound, an error is raised:

meta = BoundMetaData('sqlite:///:memory:')
employees = Table('employees', meta, 
    Column('employee_id', Integer, primary_key=True),
    Column('employee_name', String(60), nullable=False, key='name'),
    Column('employee_dept', Integer, ForeignKey("departments.department_id"))
)
employees.create()

Dropping is accomplished via the drop() method:

employees.drop(engine=e)

Entire groups of Tables can be created and dropped directly from the MetaData object with create_all() and drop_all(), each of which take an optional engine keyword argument which can reference an Engine or a Connection, else the underlying bound Engine is used:

engine = create_engine('sqlite:///:memory:')

metadata = MetaData()

users = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('email_address', String(60), key='email'),
    Column('password', String(20), nullable = False)
)

user_prefs = Table('user_prefs', metadata, 
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
    Column('pref_name', String(40), nullable=False),
    Column('pref_value', String(100))
)

metadata.create_all(engine=engine)
Column Defaults and OnUpdates

SQLAlchemy includes flexible constructs with which to create default values for columns upon the insertion of rows, as well as upon update. These defaults can take several forms: a constant, a Python callable to be pre-executed before the SQL is executed, a SQL expression or function to be pre-executed before the SQL is executed, a pre-executed Sequence (for databases that support sequences), or a "passive" default, which is a default function triggered by the database itself upon insert, the value of which can then be post-fetched by the engine, provided the row has a primary key with which to locate it.

Pre-Executed Insert Defaults

A basic default is most easily specified by the "default" keyword argument to Column. This defines a value, function, or SQL expression that will be pre-executed to produce the new value, before the row is inserted:

# a function to create primary key ids
i = 0
def mydefault():
    global i
    i += 1
    return i

t = Table("mytable", meta, 
    # function-based default
    Column('id', Integer, primary_key=True, default=mydefault),

    # a scalar default
    Column('key', String(10), default="default")
)

The "default" keyword can also take SQL expressions, including select statements or direct function calls:

t = Table("mytable", meta, 
    Column('id', Integer, primary_key=True),

    # define 'create_date' to default to now()
    Column('create_date', DateTime, default=func.now()),

    # define 'key' to pull its default from the 'keyvalues' table
    Column('key', String(20), default=keyvalues.select(keyvalues.c.type=='type1', limit=1))
    )

The "default" keyword argument is shorthand for using a ColumnDefault object in a column definition. This syntax is optional, but is required for other types of defaults, futher described below:

Column('mycolumn', String(30), ColumnDefault(func.get_data()))
back to section top
Pre-Executed OnUpdate Defaults

Similar to an on-insert default is an on-update default, which is most easily specified by the "onupdate" keyword to Column, which also can be a constant, plain Python function or SQL expression:

t = Table("mytable", meta, 
    Column('id', Integer, primary_key=True),

    # define 'last_updated' to be populated with current_timestamp (the ANSI-SQL version of now())
    Column('last_updated', DateTime, onupdate=func.current_timestamp()),
)

To use an explicit ColumnDefault object to specify an on-update, use the "for_update" keyword argument:

Column('mycolumn', String(30), ColumnDefault(func.get_data(), for_update=True))
back to section top
Inline Default Execution: PassiveDefault

A PassiveDefault indicates a column default or on-update value that is executed by the database itself. This construct specifies a SQL function that will be rendered as "DEFAULT" when creating tables, and also indicates the presence of new data that is available to be "post-fetched" after an insert or update execution.

t = Table('test', meta, 
    Column('mycolumn', DateTime, PassiveDefault("sysdate"))
)

A create call for the above table will produce:

CREATE TABLE test (
    mycolumn datetime default sysdate
)

PassiveDefaults also send a message to the Engine that data is available after update or insert. The object-relational mapper system uses this information to post-fetch rows after insert or update, so that instances can be refreshed with the new data. Below is a simplified version:

# table with passive defaults
mytable = Table('mytable', meta, 
    Column('my_id', Integer, primary_key=True),

    # an on-insert database-side default
    Column('data1', Integer, PassiveDefault("d1_func")),

    # an on-update database-side default
    Column('data2', Integer, PassiveDefault("d2_func", for_update=True))
)
# insert a row
r = mytable.insert().execute()

# check the result: were there defaults fired off on that row ?
if r.lastrow_has_defaults():
    # postfetch the row based on primary key.
    # this only works for a table with primary key columns defined
    primary_key = r.last_inserted_ids()
    row = mytable.select(mytable.c.my_id == primary_key[0]).execute().fetchone()

When Tables are reflected from the database using autoload=True, any DEFAULT values set on the columns will be reflected in the Table object as PassiveDefault instances.
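As a quick illustration (a sketch only; it assumes the 'test' table from the example above has been created in a file-based database):

# reflect the table; any server-side DEFAULT clauses are represented
# as PassiveDefault instances on the reflected columns
meta2 = BoundMetaData('sqlite:///tutorial.db')
reflected_test = Table('test', meta2, autoload=True)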

The Catch: Postgres Primary Key Defaults always Pre-Execute

Current Postgres support does not rely upon OIDs to determine the identity of a row. This is because the usage of OIDs has been deprecated with Postgres, and they are disabled by default for table creates as of PG version 8. Psycopg2's "cursor.lastrowid" function only returns OIDs. Therefore, when inserting a new row which has passive defaults set on the primary key columns, the default function is still pre-executed, since SQLAlchemy would otherwise have no way of retrieving the row just inserted.

back to section top
Defining Sequences

A table with a sequence looks like:

table = Table("cartitems", meta, 
    Column("cart_id", Integer, Sequence('cart_id_seq'), primary_key=True),
    Column("description", String(40)),
    Column("createdate", DateTime())
)

The Sequence is used with Postgres or Oracle to indicate the name of a database sequence that will be used to create default values for a column. When a table with a Sequence on a column is created in the database by SQLAlchemy, the database sequence object is also created; similarly, the sequence is dropped when the table is dropped. Sequences are typically used with primary key columns. When using Postgres, if an integer primary key column defines no explicit Sequence or other default method, SQLAlchemy will create the column with the SERIAL keyword, and will pre-execute a sequence named "tablename_columnname_seq" in order to retrieve new primary key values. Oracle, which has no "auto-increment" keyword, requires that a Sequence be created for a table if automatic primary key generation is desired.

A Sequence object can be defined on a Table that is then used for a non-sequence-supporting database. In that case, the Sequence object is simply ignored. Note that a Sequence object is entirely optional for all databases except Oracle, as other databases offer options for auto-creating primary key values, such as AUTOINCREMENT, SERIAL, etc. SQLAlchemy will use these default methods for creating primary key values if no Sequence is present on the table metadata.

A sequence can also be specified with optional=True which indicates the Sequence should only be used on a database that requires an explicit sequence, and not those that supply some other method of providing integer values. At the moment, it essentially means "use this sequence only with Oracle and not Postgres".
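For example, the cartitems table above could mark its sequence as optional; the sequence is then used on Oracle and ignored on databases that auto-generate integer primary keys themselves:

table = Table("cartitems", meta, 
    # the sequence fires only on databases with no auto-increment facility
    Column("cart_id", Integer, Sequence('cart_id_seq', optional=True), primary_key=True),
    Column("description", String(40)),
    Column("createdate", DateTime())
)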

back to section top
Defining Indexes

Indexes can be defined on table columns: named or anonymous, unique or non-unique, single-column or multiple-column. Indexes are included along with table create and drop statements. They are not used for any kind of run-time constraint checking; SQLAlchemy leaves that job to the expert on constraint checking, the database itself.

boundmeta = BoundMetaData('postgres://scott:tiger@localhost/test')
mytable = Table('mytable', boundmeta, 
    # define a unique index 
    Column('col1', Integer, unique=True),

    # define a unique index with a specific name
    Column('col2', Integer, unique='mytab_idx_1'),

    # define a non-unique index
    Column('col3', Integer, index=True),

    # define a non-unique index with a specific name
    Column('col4', Integer, index='mytab_idx_2'),

    # pass the same name to multiple columns to add them to the same index
    Column('col5', Integer, index='mytab_idx_2'),

    Column('col6', Integer),
    Column('col7', Integer)
)

# create the table.  all the indexes will be created along with it.
mytable.create()

# indexes can also be specified standalone
i = Index('mytab_idx_3', mytable.c.col6, mytable.c.col7, unique=False)

# which can then be created separately (will also get created with table creates)
i.create()
back to section top
Adapting Tables to Alternate Metadata

A Table object created against a specific MetaData object can be re-created against a new MetaData using the tometadata method:

# create two metadata
meta1 = BoundMetaData('sqlite:///querytest.db')
meta2 = MetaData()

# load 'users' from the sqlite engine
users_table = Table('users', meta1, autoload=True)

# create the same Table object for the plain metadata
users_table_2 = users_table.tometadata(meta2)
back to section top

Constructing SQL Queries via Python Expressions

Note: This section describes how to use SQLAlchemy to construct SQL queries and receive result sets. It does not cover the object-relational mapping capabilities of SQLAlchemy; those are covered later on in Data Mapping. However, both areas of functionality work similarly in how selection criteria are constructed, so if you are interested just in the ORM, you should probably skim through basic WHERE Clause construction before moving on.

Once you have used the sqlalchemy.schema module to construct your tables and/or reflect them from the database, performing SQL queries using those table metadata objects is done via the sqlalchemy.sql package. This package defines a large set of classes, each of which represents a particular kind of lexical construct within a SQL query; all are descendants of the common base class sqlalchemy.sql.ClauseElement. A full query is represented via a structure of ClauseElements. A set of reasonably intuitive creation functions is provided by the sqlalchemy.sql package to create these structures; these functions are described in the rest of this section.

Executing a ClauseElement structure can be performed in two general ways. You can use an Engine or a Connection object's execute() method to which you pass the query structure; this is known as explicit style. Or, if the ClauseElement structure is built upon Table metadata which is bound to an Engine directly, you can simply call execute() on the structure itself, known as implicit style. In both cases, the execution returns a cursor-like object (more on that later). The same clause structure can be executed repeatedly. The ClauseElement is compiled into a string representation by an underlying Compiler object which is associated with the Engine via its Dialect.

The examples below all include a dump of the generated SQL corresponding to the query object, as well as a dump of the statement's bind parameters. In all cases, bind parameters are shown as named parameters using the colon format (i.e. ':name'). When the statement is compiled into a database-specific version, the named-parameter statement and its bind values are converted to the proper paramstyle for that database automatically.

For this section, we will mostly use the implicit style of execution, meaning the Table objects are associated with an instance of BoundMetaData, and constructed ClauseElement objects support self-execution. Assume the following configuration:

from sqlalchemy import *
metadata = BoundMetaData('sqlite:///mydb.db', strategy='threadlocal', echo=True)

# a table to store users
users = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses = Table('addresses', metadata,
    Column('address_id', Integer, primary_key = True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords = Table('userkeywords', metadata,
    Column('user_id', INT, ForeignKey("users")),
    Column('keyword_id', INT, ForeignKey("keywords"))
)
Simple Select

A select is done by constructing a Select object with the proper arguments, adding any extra arguments if desired, then calling its execute() method.

Basic Select
from sqlalchemy import *

# use the select() function defined in the sql package
s = select([users])

# or, call the select() method off of a Table object
s = users.select()

# then, call execute on the Select object:
result = s.execute()
# the SQL text of any clause object can also be viewed via the str() call:
>>> str(s)
SELECT users.user_id, users.user_name, users.password FROM users
Explicit Execution

As mentioned above, ClauseElement structures can also be executed with a Connection object explicitly:

engine = create_engine('sqlite:///myfile.db')
conn = engine.connect()

s = users.select()
result = conn.execute(s)
conn.close()
back to section top
Binding ClauseElements to Engines

For queries that don't contain any tables, ClauseElements that represent a fully executable statement support an engine keyword parameter which can bind the object to an Engine, thereby allowing implicit execution:

# select a literal
sqlselect(["current_time"], engine=myengine).execute()
# select a function
select([func.now()], engine=db).execute()
back to section top
Getting Results

The object returned by execute() is a sqlalchemy.engine.ResultProxy object, which acts much like a DBAPI cursor object in the context of a result set, except that the rows returned can address their columns by ordinal position, column name, or even column object:

Using the ResultProxy
# select rows, get resulting ResultProxy object
result = users.select().execute()
# get one row
row = result.fetchone()

# get the 'user_id' column via integer index:
user_id = row[0]

# or column name
user_name = row['user_name']

# or column object
password = row[users.c.password]

# or column accessor
password = row.password

# ResultProxy object also supports fetchall()
rows = result.fetchall()

# or get the underlying DBAPI cursor object
cursor = result.cursor

# close the result.  If the statement was implicitly executed 
# (i.e. without an explicit Connection), this will
# return the underlying connection resources back to 
# the connection pool.  de-referencing the result
# will also have the same effect.  if an explicit Connection was 
# used, then close() does nothing.
result.close()
back to section top
Using Column Labels

A common need when writing statements that reference multiple tables is to create labels for columns, thereby separating columns from different tables with the same name. The Select construct supports automatic generation of column labels via the use_labels=True parameter:

use_labels Flag
c = select([users, addresses], 
    users.c.user_id==addresses.c.user_id, 
    use_labels=True).execute()

The table name part of the label is affected if you use a construct such as a table alias:

use_labels with an Alias
person = users.alias('person')
c = select([person, addresses], 
    person.c.user_id==addresses.c.user_id, 
    use_labels=True).execute()

Labels are also generated in such a way as to never go beyond 30 characters. Most databases impose a limit on the length of identifiers; Postgres does, and Oracle in particular has a rather short limit of 30:

use_labels Generates Abbreviated Labels
long_named_table = users.alias('this_is_the_person_table')
c = select([long_named_table], use_labels=True).execute()

You can also specify custom labels on a per-column basis using the label() function:

label() Function on Column
c = select([users.c.user_id.label('id'), 
           users.c.user_name.label('name')]).execute()
back to section top
Table/Column Specification

Calling select off a table automatically generates a column clause which includes all the table's columns, in the order they are specified in the source Table object.

But in addition to selecting all the columns off a single table, any set of columns can be specified, as well as full tables, and any combination of the two:

Specify Columns to Select
# individual columns
c = select([users.c.user_id, users.c.user_name]).execute()
# full tables
c = select([users, addresses]).execute()
# combinations
c = select([users, addresses.c.zip]).execute()
back to section top
WHERE Clause

The WHERE condition is specified via the whereclause keyword argument, which is also the second positional argument to the select() function and the first positional argument to the select() method of Table.

WHERE conditions are constructed using column objects, literal values, and functions defined in the sqlalchemy.sql module. Column objects override the standard Python operators to provide clause compositional objects, which compile down to SQL operations:

Basic WHERE Clause
c = users.select(users.c.user_id == 7).execute()

Notice that the literal value "7" was broken out of the query and placed into a bind parameter. Databases such as Oracle must parse incoming SQL and create a "plan" when new queries are received, which is an expensive process. By using bind parameters, the same query with various literal values can have its plan compiled only once, and used repeatedly with less overhead.

More where clauses:

# another comparison operator
c = select([users], users.c.user_id>7).execute()
# OR keyword
c = users.select(or_(users.c.user_name=='jack', users.c.user_name=='ed')).execute()
# AND keyword
c = users.select(and_(users.c.user_name=='jack', users.c.password=='dog')).execute()
# NOT keyword
c = users.select(not_(
        or_(users.c.user_name=='jack', users.c.password=='dog')
    )).execute()
# IN clause
c = users.select(users.c.user_name.in_('jack', 'ed', 'fred')).execute()
# join users and addresses together
c = select([users, addresses], users.c.user_id==addresses.c.user_id).execute()
# join users and addresses together, but don't specify "addresses" in the 
# selection criterion.  The WHERE criterion adds it to the FROM list 
# automatically.
c = select([users], and_(
                users.c.user_id==addresses.c.user_id,
                users.c.user_name=='fred'
            )).execute()

Select statements can also generate a WHERE clause based on the parameters you give them. If a given parameter matches the name of a column or its "label" (the combined tablename + "_" + column name), and does not already correspond to a bind parameter in the select object, it will be added as an equality comparison against that column. This is a shortcut to creating a full WHERE clause:

# specify a match for the "user_name" column
c = users.select().execute(user_name='ed')
# specify a full where clause for the "user_name" column, as well as a
# comparison for the "user_id" column
c = users.select(users.c.user_name=='ed').execute(user_id=10)
Operators

Supported column operators so far are all the numerical comparison operators, i.e. '==', '>', '>=', etc., as well as like(), startswith(), endswith(), between(), and in_(). Boolean operators include not_(), and_() and or_(), which can also be used inline via '~', '&', and '|'. Math operators are '+', '-', '*', '/'. Any custom operator can be specified via the op() function shown below.

# "like" operator
users.select(users.c.user_name.like('%ter'))

# equality operator
users.select(users.c.user_name == 'jane')

# in operator
users.select(users.c.user_id.in_(1,2,3))

# and_, endswith, equality operators
users.select(and_(addresses.c.street.endswith('green street'),
                addresses.c.zip=='11234'))

# & operator subsituting for 'and_'
users.select(addresses.c.street.endswith('green street') & (addresses.c.zip=='11234'))

# + concatenation operator
select([users.c.user_name + '_name'])

# NOT operator
users.select(~(addresses.c.street == 'Green Street'))

# any custom operator
select([users.c.user_name.op('||')('_category')])

# "null" comparison via == (converts to IS)
users.select(users.c.user_name==None).execute()
# or via explicit null() construct
users.select(users.c.user_name==null()).execute()
back to section top
Functions

Functions can be specified using the func keyword:

select([func.count(users.c.user_id)]).execute()
users.select(func.substr(users.c.user_name, 1) == 'J').execute()

Functions also are callable as standalone values:

# call the "now()" function
time = func.now(engine=myengine).scalar()

# call myfunc(1,2,3)
myvalue = func.myfunc(1, 2, 3, engine=db).execute()

# or call them off the engine
db.func.now().scalar()
back to section top
Literals

You can drop in a literal value anywhere there isn't a column to attach to, via the literal keyword:

select([literal('foo') + literal('bar'), users.c.user_name]).execute()
# literals have all the same comparison functions as columns
select([literal('foo') == literal('bar')], engine=myengine).scalar()

Literals also take an optional type parameter to give literals a type. This can sometimes be significant, for example when using the "+" operator with SQLite, the String type is detected and the operator is converted to "||":

select([literal('foo', type=String) + 'bar'], engine=e).execute()
back to section top
Order By

The ORDER BY clause of a select statement can be specified as a list of columns via the order_by parameter, optionally using the asc() and desc() functions:

# straight order by
c = users.select(order_by=[users.c.user_name]).execute()
# descending/ascending order by on multiple columns
c = users.select(
    users.c.user_name>'J', 
    order_by=[desc(users.c.user_id), asc(users.c.user_name)]).execute()
back to section top
DISTINCT, LIMIT and OFFSET

These are specified as keyword arguments:

c = select([users.c.user_name], distinct=True).execute()
c = users.select(limit=10, offset=20).execute()

The Oracle driver does not support LIMIT and OFFSET directly, but instead wraps the generated query into a subquery and uses the "rownum" variable to control the rows selected (this is somewhat experimental).

back to section top
Inner and Outer Joins

As some of the examples above indicated, a regular inner join can be stated implicitly, just as in a SQL expression, by specifying the tables to be joined as well as their join conditions:

addresses.select(addresses.c.user_id==users.c.user_id).execute()

There is also an explicit join constructor, which can be embedded into a select query via the from_obj parameter of the select statement:

addresses.select(from_obj=[
    addresses.join(users, addresses.c.user_id==users.c.user_id)
]).execute()

The join constructor can also be used by itself:

join(users, addresses, users.c.user_id==addresses.c.user_id).select().execute()

The join criterion in a join() call is optional. If not specified, the condition will be derived from the foreign key relationships of the two tables. If no criterion can be constructed, an exception will be raised.

join(users, addresses).select().execute()

Notice that this is the first example where the FROM criterion of the select statement is explicitly specified. In most cases, the FROM criterion is automatically determined from the columns requested as well as the WHERE clause. The from_obj keyword argument indicates a list of explicit FROM clauses to be used in the statement.

A join can be created on its own using the join or outerjoin functions, or can be created off of an existing Table or other selectable unit via the join or outerjoin methods:

outerjoin(users, addresses, 
           users.c.user_id==addresses.c.user_id).select().execute()
users.select(keywords.c.name=='running', from_obj=[
        users.join(
            userkeywords, userkeywords.c.user_id==users.c.user_id).join(
                keywords, keywords.c.keyword_id==userkeywords.c.keyword_id)
        ]).execute()
back to section top
Table Aliases

Aliases are used primarily when you want to use the same table more than once as a FROM expression in a statement:

address_b = addresses.alias('addressb')
# select users who have an address on Green street as well as Orange street
users.select(and_(
    users.c.user_id==addresses.c.user_id,
    addresses.c.street.like('%Green%'),
    users.c.user_id==address_b.c.user_id,
    address_b.c.street.like('%Orange%')
)).execute()
back to section top
Subqueries

SQLAlchemy allows the creation of select statements from not just Table objects, but from a whole class of objects that implement the Selectable interface. This includes Tables, Aliases, Joins and Selects. Therefore, if you have a Select, you can select from the Select:

>>> s = users.select()
>>> str(s)
SELECT users.user_id, users.user_name, users.password FROM users

>>> s = s.select()
>>> str(s)
SELECT user_id, user_name, password
FROM (SELECT users.user_id, users.user_name, users.password FROM users)

Any Select, Join, or Alias object supports the same column accessors as a Table:

>>> s = users.select()
>>> [c.key for c in s.columns]
['user_id', 'user_name', 'password']

When you use use_labels=True in a Select object, the label versions of the column names become the keys of the accessible columns. In effect you can create your own "view objects":

s = select([users, addresses], users.c.user_id==addresses.c.user_id, use_labels=True)
select([
    s.c.users_user_name, s.c.addresses_street, s.c.addresses_zip
], s.c.addresses_city=='San Francisco').execute()

To specify a SELECT statement as one of the selectable units in a FROM clause, it usually should be given an alias.

s = users.select().alias('u')
select([addresses, s]).execute()

Select objects can be used in a WHERE condition, in operators such as IN:

# select user ids for all users whose name starts with a "p"
s = select([users.c.user_id], users.c.user_name.like('p%'))

# now select all addresses for those users
addresses.select(addresses.c.user_id.in_(s)).execute()

The sql package supports embedding select statements into other select statements as the criterion in a WHERE condition, or as one of the "selectable" objects in the FROM list of the query. It does not at the moment directly support embedding a SELECT statement as one of the column criterion for a statement, although this can be achieved via direct text insertion, described later.

Scalar Column Subqueries

Subqueries can be used in the column clause of a select statement by specifying the scalar=True flag:

select([table2.c.col1, table2.c.col2, 
                select([table1.c.col1], table1.c.col2==7, scalar=True)])
back to section top
Correlated Subqueries

When a select object is embedded inside of another select object, and both objects reference the same table, SQLAlchemy makes the assumption that the table should be correlated from the child query to the parent query. To disable this behavior, specify the flag correlate=False to the Select statement.

# make an alias of a regular select.   
s = select([addresses.c.street], addresses.c.user_id==users.c.user_id).alias('s')
>>> str(s)
SELECT addresses.street FROM addresses, users 
WHERE addresses.user_id = users.user_id

# now embed that select into another one.  the "users" table is removed from
# the embedded query's FROM list and is instead correlated to the parent query
s2 = select([users, s.c.street])
>>> str(s2)
SELECT users.user_id, users.user_name, users.password, s.street
FROM users, (SELECT addresses.street FROM addresses
WHERE addresses.user_id = users.user_id) s
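A minimal sketch of the correlate=False flag mentioned above: the embedded query keeps "users" in its own FROM list instead of correlating to the enclosing statement (variable names here are illustrative only):

# the inner select refuses to correlate, so its FROM list is unchanged
s3 = select([addresses.c.street],
            addresses.c.user_id==users.c.user_id,
            correlate=False).alias('s3')
s4 = select([users, s3.c.street])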
EXISTS Clauses

An EXISTS clause can function as a more scalable version of an IN clause, and is usually used in a correlated fashion:

# find all users who have an address on Green street:
users.select(
    exists(
        [addresses.c.address_id], 
        and_(
            addresses.c.user_id==users.c.user_id, 
            addresses.c.street.like('%Green%')
        )
    )
)
back to section top
Unions

Unions come in two flavors, UNION and UNION ALL, which are available via module level functions or methods off a Selectable:

union(
    addresses.select(addresses.c.street=='123 Green Street'),
    addresses.select(addresses.c.street=='44 Park Ave.'),
    addresses.select(addresses.c.street=='3 Mill Road'),
    order_by=[addresses.c.street]
).execute()
users.select(
    users.c.user_id==7
  ).union_all(
      users.select(
          users.c.user_id==9
      ), 
      order_by=[users.c.user_id]   # order_by is an argument to union_all()
  ).execute()
back to section top
Custom Bind Parameters

Throughout all these examples, SQLAlchemy is busy creating bind parameters wherever literal expressions occur. You can also specify your own bind parameters with your own names, and use the same statement repeatedly. The bind parameters, shown here in the "named" format, will be converted to the appropriate named or positional style according to the database implementation being used.

Custom Bind Params
s = users.select(users.c.user_name==bindparam('username'))

# execute implicitly
s.execute(username='fred')
# execute explicitly
conn = engine.connect()
conn.execute(s, username='fred')

executemany() is also available by supplying multiple dictionary arguments instead of keyword arguments to the execute() method of ClauseElement or Connection. Examples can be found later in the sections on INSERT/UPDATE/DELETE.

Precompiling a Query

By calling the compile() method on any query object, the query can be "compiled" by the SQLEngine into a sqlalchemy.sql.Compiled object just once, and the resulting compiled object reused, which eliminates repeated internal compilation of the SQL string:

s = users.select(users.c.user_name==bindparam('username')).compile()
s.execute(username='fred')
s.execute(username='jane')
s.execute(username='mary')
back to section top
Literal Text Blocks

The sql package tries to allow free textual placement in as many ways as possible. In the examples below, note that the from_obj parameter is used only when no other information exists within the select object with which to determine table metadata. Also note that in a query where there isn't even table metadata used, the SQLEngine to be used for the query has to be explicitly specified:

# strings as column clauses
sqlselect(["user_id", "user_name"], from_obj=[users]).execute()
# strings for full column lists
select(
        ["user_id, user_name, password, addresses.*"], 
        from_obj=[users.alias('u'), addresses]).execute()
# functions, etc.
select([users.c.user_id, "process_string(user_name)"]).execute()
# where clauses
users.select(and_(users.c.user_id==7, "process_string(user_name)=27")).execute()
# subqueries
users.select(
    "exists (select 1 from addresses where addresses.user_id=users.user_id)").execute()
# custom FROM objects
select(
        ["*"], 
        from_obj=["(select user_id, user_name from users)"], 
        engine=db).execute()
# a full query
sqltext("select user_name from users", engine=db).execute()
Using Bind Parameters in Text Blocks

Use the format ':paramname' to define bind parameters inside of a text block. They will be converted to the appropriate format upon compilation:

t = engine.text("select foo from mytable where lala=:hoho")
r = t.execute(hoho=7)

Bind parameters can also be explicit, which allows typing information to be added. Just specify them as a list with keys that match those inside the textual statement:

t = engine.text("select foo from mytable where lala=:hoho", 
                bindparams=[bindparam('hoho', type=types.String)])
    r = t.execute(hoho="im hoho")

Result-row type processing can be added via the typemap argument, which is a dictionary of return columns mapped to types:

# specify DateTime type for the 'foo' column in the result set
# sqlite, for example, uses result-row post-processing to construct dates
t = engine.text("select foo from mytable where lala=:hoho", 
        bindparams=[bindparam('hoho', type=types.String)],
        typemap={'foo':types.DateTime}
        )
r = t.execute(hoho="im hoho")

# 'foo' is a datetime
year = r.fetchone()['foo'].year
back to section top
Building Select Objects

One of the primary motivations for a programmatic SQL library is to allow the piecemeal construction of a SQL statement based on program variables. All the above examples typically show Select objects being created all at once. The Select object also includes "builder" methods to allow building up an object. The below example is a "user search" function, where users can be selected based on primary key, user name, street address, keywords, or any combination:

def find_users(id=None, name=None, street=None, keyword_names=None):
    statement = users.select()
    if id is not None:
        statement.append_whereclause(users.c.user_id==id)
    if name is not None:
        statement.append_whereclause(users.c.user_name==name)
    if street is not None:
        # append_whereclause joins "WHERE" conditions together with AND
        statement.append_whereclause(users.c.user_id==addresses.c.user_id)
        statement.append_whereclause(addresses.c.street==street)
    if keyword_names is not None:
        statement.append_from(
                users.join(userkeywords, users.c.user_id==userkeywords.c.user_id).join(
                        keywords, userkeywords.c.keyword_id==keywords.c.keyword_id))
        # the parameter is named keyword_names so that it does not shadow
        # the "keywords" table object
        statement.append_whereclause(keywords.c.name.in_(*keyword_names))
        # to avoid multiple repeats, set query to be DISTINCT:
        statement.distinct=True
    return statement.execute()

find_users(id=7)
find_users(street='123 Green Street')
find_users(name='Jack', keyword_names=['jack','foo'])
back to section top
Inserts

An INSERT involves just one table. The Insert object is used via the insert() function, and the specified columns determine what columns show up in the generated SQL. If primary key columns are left out of the criterion, the SQL generator will try to populate them as specified by the particular database engine and sequences, i.e. relying upon an auto-incremented column or explicitly calling a sequence beforehand. Insert statements, as well as updates and deletes, can also execute multiple parameters in one pass via specifying an array of dictionaries as parameters.

The values to be populated for an INSERT or an UPDATE can be specified to the insert()/update() functions as the values named argument, or the query will be compiled based on the values of the parameters sent to the execute() method.

Using insert()
# basic insert
users.insert().execute(user_id=1, user_name='jack', password='asdfdaf')
# insert just user_name, NULL for others
# will auto-populate primary key columns if they are configured
# to do so
users.insert().execute(user_name='ed')
# INSERT with a list:
users.insert(values=(3, 'jane', 'sdfadfas')).execute()
# INSERT with user-defined bind parameters
i = users.insert(
    values={'user_name':bindparam('name'), 'password':bindparam('pw')}
    )
i.execute(name='mary', pw='adas5fs')
# INSERT many - if no explicit 'values' parameter is sent,
# the first parameter list in the list determines
# the generated SQL of the insert (i.e. what columns are present)
# executemany() is used at the DBAPI level
users.insert().execute(
    {'user_id':7, 'user_name':'jack', 'password':'asdfasdf'},
    {'user_id':8, 'user_name':'ed', 'password':'asdffcadf'},
    {'user_id':9, 'user_name':'fred', 'password':'asttf'},
)
back to section top
Updates

Updates work a lot like INSERTs, except there is an additional WHERE clause that can be specified.

Using update()
# change 'jack' to 'ed'
users.update(users.c.user_name=='jack').execute(user_name='ed')
# use bind parameters
u = users.update(users.c.user_name==bindparam('name'), 
                values={'user_name':bindparam('newname')})
u.execute(name='jack', newname='ed')
# update a column to another column
users.update(values={users.c.password:users.c.user_name}).execute()
# multi-update
users.update(users.c.user_id==bindparam('id')).execute(
        {'id':7, 'user_name':'jack', 'password':'fh5jks'},
        {'id':8, 'user_name':'ed', 'password':'fsr234ks'},
        {'id':9, 'user_name':'mary', 'password':'7h5jse'},
    )
Correlated Updates

A correlated update lets you update a table using selection from another table, or the same table:

s = select([addresses.c.city], addresses.c.user_id==users.c.user_id)
users.update(
    and_(users.c.user_id>10, users.c.user_id<20), 
    values={users.c.user_name:s}
).execute()
back to section top
Deletes

A delete is formulated like an update, except there's no values clause:

users.delete(users.c.user_id==7).execute()
users.delete(users.c.user_name.like(bindparam('name'))).execute(
        {'name':'%Jack%'},
        {'name':'%Ed%'},
        {'name':'%Jane%'},
    )
back to section top

Basic Data Mapping

Data mapping describes the process of defining Mapper objects, which associate table metadata with user-defined classes.

The Mapper's role is to perform SQL operations upon the database, associating individual table rows with instances of those classes, and individual database columns with properties upon those instances, to transparently associate in-memory objects with a persistent database representation.

When a Mapper is created to associate a Table object with a class, all of the columns defined in the Table object are associated with the class via property accessors, which add overriding functionality to the normal process of setting and getting object attributes. These property accessors keep track of changes to object attributes; these changes will be stored to the database when the application "flushes" the current state of objects (known as a Unit of Work).

Two objects provide the primary interface for interacting with Mappers and the "unit of work" in SA 0.2, which are the Query object and the Session object. Query deals with selecting objects from the database, whereas Session provides a context for loaded objects and the ability to communicate changes on those objects back to the database.

The primary method on Query for loading objects is its select() method, which has similar arguments to a sqlalchemy.sql.Select object. But this select method executes automatically and returns results, instead of awaiting an execute() call. Instead of returning a cursor-like object, it returns an array of objects.

The three configurational elements to be defined, i.e. the Table metadata, the user-defined class, and the Mapper, are typically defined as module-level variables, and may be defined in any fashion suitable to the application, with the only requirement being that the class and table metadata are described before the mapper. For the sake of example, we will be defining these elements close together, but this should not be construed as a requirement; since SQLAlchemy is not a framework, those decisions are left to the developer or an external framework.

Also, keep in mind that the examples in this section deal with explicit Session objects mapped directly to Engine objects, which represents the most explicit style of using the ORM. Options exist for how this is configured, including binding Table objects directly to Engines (described in Binding MetaData to an Engine), as well as using the "Threadlocal" plugin which provides various code shortcuts by using an implicit Session associated to the current thread (described in threadlocal).

back to section top
Synopsis

First, the metadata/mapper configuration code:

from sqlalchemy import *

# metadata
meta = MetaData()

# table object
users_table = Table('users', meta, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# class definition 
class User(object):
    pass

# create a mapper and associate it with the User class.
# technically we don't really need the 'usermapper' variable.
usermapper = mapper(User, users_table)

Note that no database definitions are required. Next we will define an Engine and connect a Session to it, and perform a simple select:

# engine
engine = create_engine("sqlite://mydb.db")

# session
session = create_session(bind_to=engine)

# select
user = session.query(User).select_by(user_name='fred')[0]
# modify
user.user_name = 'fred jones'

# flush - saves everything that changed
session.flush()
back to section top
The Query Object

The method session.query(class_or_mapper) returns a Query object. Below is a synopsis of things you can do with Query:

# get a query from a Session based on class:
query = session.query(User)

# get a query from a Session given a Mapper:
query = session.query(usermapper)

# select_by, which takes keyword arguments.  the
# keyword arguments represent property names and the values
# represent values which will be compared via the = operator.
# the comparisons are joined together via "AND".
result = query.select_by(name='john', street='123 green street')

# select_by can also combine ClauseElements with key/value properties.
# all ClauseElements and keyword-based criterion are combined together
# via "AND". 
result = query.select_by(users_table.c.user_name=='john', 
        addresses_table.c.zip_code=='12345', street='123 green street')

# get_by, which takes the same arguments as select_by
# returns a single scalar result or None if no results
user = query.get_by(id=12)

# "dynamic" versions of select_by and get_by - everything past the 
# "select_by_" or "get_by_" is used as the key, and the function argument
# as the value
result = query.select_by_name('fred')
u = query.get_by_name('fred')

# get an object directly from its primary key.  this will bypass the SQL
# call if the object has already been loaded
u = query.get(15)

# get an object that has a composite primary key of three columns.
# the order of the arguments matches that of the table meta data.
myobj = query.get((27, 3, 'receipts'))

# using a WHERE criterion
result = query.select(or_(users_table.c.user_name == 'john', users_table.c.user_name=='fred'))

# using a WHERE criterion to get a scalar
u = query.selectfirst(users_table.c.user_name=='john')

# selectone() is a stricter version of selectfirst() which
# will raise an exception if there is not exactly one row
u = query.selectone(users_table.c.user_name=='john')

# using a full select object
result = query.select(users_table.select(users_table.c.user_name=='john'))

Some of the examples above illustrate the usage of the mapper's Table object to provide the columns for a WHERE clause. These columns are also accessible off of the mapped class directly. When a mapper is assigned to a class, it also attaches a special property accessor c to the class itself, which can be used just like the table metadata to access the columns of the table:

userlist = session.query(User).select(User.c.user_id==12)
back to section top
Saving Objects

When objects corresponding to mapped classes are created or manipulated, all changes are logged by the Session object. The changes are then written to the database when an application calls flush(). This pattern is known as a Unit of Work, and has many advantages over saving individual objects or attributes on those objects with individual method invocations. Domain models can be built with far greater complexity with no concern over the order of saves and deletes, excessive database round-trips and write operations, or deadlocking issues. The flush() operation batches its SQL statements into a transaction, and can also perform optimistic concurrency checks (using a version id column) to ensure the proper number of rows were in fact affected (not supported with the current MySQL drivers).
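A sketch of the optimistic concurrency check mentioned above, assuming users_table carries an extra Integer column named 'row_version' (not part of this tutorial's schema) and the version_id_col mapper argument:

# the mapper increments 'row_version' with each UPDATE, and verifies
# that the expected value was present in the row being updated
mapper(User, users_table, version_id_col=users_table.c.row_version)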

The Unit of Work is a powerful tool, and has some important concepts that should be understood in order to use it effectively. See the Session / Unit of Work section for a full description on all its operations.

When a mapper is created, the target class has its mapped properties decorated by specialized property accessors that track changes. New objects by default must be explicitly added to the Session, however this can be made automatic by using threadlocal or SessionContext.

mapper(User, users_table)

# create a new User
myuser = User()
myuser.user_name = 'jane'
myuser.password = 'hello123'

# create another new User      
myuser2 = User()
myuser2.user_name = 'ed'
myuser2.password = 'lalalala'

# create a Session and save them
sess = create_session()
sess.save(myuser)
sess.save(myuser2)

# load a third User from the database            
myuser3 = sess.query(User).select(User.c.user_name=='fred')[0]
myuser3.user_name = 'fredjones'

# save all changes            
sess.flush()

The mapped class can also specify whatever methods and/or constructor it wants:

class User(object):
    def __init__(self, user_name, password):
        self.user_id = None
        self.user_name = user_name
        self.password = password
    def get_name(self):
        return self.user_name
    def __repr__(self):
        return "User id %s name %s password %s" % (repr(self.user_id), 
            repr(self.user_name), repr(self.password))
mapper(User, users_table)

sess = create_session()
u = User('john', 'foo')
sess.save(u)
sess.flush()
>>> u
User id 1 name 'john' password 'foo'

SQLAlchemy will only put modified object attributes into the UPDATE statements generated upon flush. This conserves database traffic and also allows successful interaction with "deferred" attributes: mapped object attributes against the mapper's primary table that aren't loaded until referenced by the application.
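A sketch of declaring such a deferred attribute, assuming the deferred() function (see the full Data Mapping documentation for the authoritative form); the 'password' column below is loaded only when first accessed on an instance:

mapper(User, users_table, properties = {
    'password' : deferred(users_table.c.password)
})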

back to section top
Defining and Using Relationships

So that covers how to map the columns in a table to an object, how to load objects, create new ones, and save changes. The next step is how to define an object's relationships to other database-persisted objects. This is done via the relation function provided by the orm module.

One to Many

So with our User class, let's also define the User as having one or more mailing addresses. First, the table metadata:

from sqlalchemy import *

metadata = MetaData()

# define user table
users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# define user address table
addresses_table = Table('addresses', metadata,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

Of importance here is the addresses table's definition of a foreign key referencing the users table, relating the user_id column in a parent-child relationship. When a Mapper wants to indicate a relation of one object to another, the ForeignKey relationships are the default method by which the relationship is determined (options also exist to describe the relationships explicitly; a sketch of the explicit form follows the mapper definition below).

So then let's define two classes: the familiar User class, as well as an Address class:

class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class Address(object):
    def __init__(self, street, city, state, zip):
        self.street = street
        self.city = city
        self.state = state
        self.zip = zip

And then a Mapper that will define a relationship of the User and the Address classes to each other as well as their table metadata. We will add an additional mapper keyword argument properties which is a dictionary relating the names of class attributes to database relationships, in this case a relation object against a newly defined mapper for the Address class:

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address)
    }
  )
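As an aside, here is a sketch of the explicit form mentioned earlier, using the primaryjoin keyword argument to state the join condition directly. This is an alternative to the mapper above, not in addition to it, and is normally unnecessary when a ForeignKey is present:

mapper(User, users_table, properties = {
        # state the join condition explicitly rather than deriving
        # it from the ForeignKey
        'addresses' : relation(Address, primaryjoin=
            users_table.c.user_id==addresses_table.c.user_id)
    }
  )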

Let's do some operations with these classes and see what happens:

engine = create_engine('sqlite:///mydb.db')
metadata.create_all(engine)

session = create_session(bind_to=engine)

u = User('jane', 'hihilala')
u.addresses.append(Address('123 anywhere street', 'big city', 'UT', '76543'))
u.addresses.append(Address('1 Park Place', 'some other city', 'OK', '83923'))

session.save(u)
session.flush()
INSERT INTO users (user_name, password) VALUES (:user_name, :password)
{'password': 'hihilala', 'user_name': 'jane'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'big city', 'state': 'UT', 'street': '123 anywhere street', 'user_id':1, 'zip': '76543'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'some other city', 'state': 'OK', 'street': '1 Park Place', 'user_id':1, 'zip': '83923'}

A lot just happened there! The Mapper figured out how to relate rows in the addresses table to the users table, and also upon flush had to determine the proper order in which to insert rows. After the insert, all the User and Address objects have their new primary and foreign key attributes populated.
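For example, the new primary key value is now present on the instances (the value 1 matching the echoed INSERT parameters above):

>>> u.user_id
1
>>> u.addresses[0].user_id
1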

Also notice that when we created a Mapper on the User class which defined an addresses relation, the newly created User instance magically had an "addresses" attribute which behaved like a list. This list is in reality a property function which returns an instance of sqlalchemy.util.HistoryArraySet. This object fulfills the full set of Python list accessors, but maintains a unique set of objects (based on their in-memory identity), and also tracks additions and deletions to the list:

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()
UPDATE addresses SET user_id=:user_id
WHERE addresses.address_id = :addresses_address_id
[{'user_id': None, 'addresses_address_id': 2}]
INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}

Note that when creating a relation with the relation() function, the target can either be a class, in which case the primary mapper for that class is used as the target, or a Mapper instance itself, as returned by the mapper() function.
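As illustration only, these are the two equivalent spellings of the relation target (use one or the other; a class is normally mapped just once):

# target given as a class; its primary mapper is located
relation(Address)

# target given as a Mapper instance
relation(mapper(Address, addresses_table))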

back to section top
Lifecycle Relations

In the previous example, a single address was removed from the addresses attribute of a User object, resulting in the corresponding database row being updated to have a user_id of None. But now there's a mailing address with no user_id floating around in the database, of no use to anyone. How can we avoid this? This is achieved by using the cascade parameter of relation:

clear_mappers()  # clear mappers from the previous example
mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address, cascade="all, delete-orphan")
    }
  )

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()
INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}
DELETE FROM addresses WHERE addresses.address_id = :address_id
[{'address_id': 2}]

In this case, with the delete-orphan cascade rule set, the element that was removed from the addresses list was also removed from the database. Specifying cascade="all, delete-orphan" means that every persistence operation performed on the parent object will be cascaded to the child object or objects handled by the relation, and additionally that each child object cannot exist without being attached to a parent. Such a relationship indicates that the lifecycle of the Address objects is bounded by that of their parent User object.

Cascading is described fully in Cascade rules.

back to section top
Backreferences

By creating relations with the backref keyword, a bi-directional relationship can be created which will keep both ends of the relationship updated automatically, independently of database operations. Below, the User mapper is created with an addresses property, and the corresponding Address mapper receives a "backreference" to the User object via the property name user:

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
                'addresses' : relation(Address, backref='user')
            }
          )

u = User('fred', 'hi')
a1 = Address('123 anywhere street', 'big city', 'UT', '76543')
a2 = Address('1 Park Place', 'some other city', 'OK', '83923')

# append a1 to u
u.addresses.append(a1)

# attach u to a2
a2.user = u

# the bi-directional relation is maintained
>>> u.addresses == [a1, a2]
True
>>> a1.user is u and a2.user is u
True

The backreference feature also works with many-to-many relationships, which are described later. When creating a backreference, a corresponding property (i.e. a second relation()) is placed on the child mapper. The default arguments to this property can be overridden using the backref() function:

mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user':relation(User, backref=backref('addresses', cascade="all, delete-orphan"))
})

The backref() function is often used to set up a bi-directional one-to-one relationship. This is because the relation() function by default creates a "one-to-many" relationship when presented with a primary key/foreign key relationship, but the backref() function can redefine the uselist property to make it a scalar:

mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user' : relation(User, backref=backref('address', uselist=False))
})
back to section top
Selecting from Relationships

We've seen how the relation specifier affects the saving of an object and its child items; how does it affect selecting them? By default, the relation keyword indicates that the related property should be attached to a lazy loader when instances of the parent object are loaded from the database; this is just a callable function that, when accessed, will invoke a second SQL query to load the child objects of the parent.

# define a mapper
mapper(User, users_table, properties = {
      'addresses' : relation(mapper(Address, addresses_table))
    })

# select users where username is 'jane', get the first element of the list
# this will incur a load operation for the parent table
user = session.query(User).select(User.c.user_name=='jane')[0]
# iterate through the User object's addresses.  this will incur an
# immediate load of those child items
for a in user.addresses:
    print repr(a)
Creating Joins Across Relations

For mappers that have relationships, the select_by method of the Query object can create queries that include automatically created joins. Just specify a key in the argument list which is not present in the primary mapper's list of properties or columns, but is present in the property list of one of its relationships:

l = session.query(User).select_by(street='123 Green Street')

The above example is shorthand for:

l = session.query(User).select(and_(
         Address.c.user_id==User.c.user_id, 
         Address.c.street=='123 Green Street')
   )

All keyword arguments sent to select_by are used to create query criterion. This means that familiar select keyword options like order_by and limit are not directly available. To enable these options with select_by, you can try the SelectResults extension which offers methods off the result of a select or select_by such as order_by() and array slicing functions that generate new queries.
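A hedged sketch of that extension (the import path and method signatures should be checked against the extension's own documentation):

from sqlalchemy.ext.selectresults import SelectResults

# wrap a Query; order_by() and slicing each generate new queries
results = SelectResults(session.query(User))
some_users = list(results.order_by(User.c.user_name)[2:5])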

Also, select_by will not create joins derived from Column-based expressions (i.e. ClauseElement objects); the reason is that a Column-based expression may include many columns, and select_by has no way to know which columns in the expression correspond to properties and which don't (it also prefers not to dig into column expressions which may be very complex). The next section describes some ways to combine Column expressions with select_by's auto-joining capabilities.

back to section top
More Granular Join Control Using join_to, join_via

Feature Status: Alpha API

The join_to method of Query is a component of the select_by operation, and is given a keyname in order to return a "join path" from the Query's mapper to the mapper which is referenced by a relation() of the given name:

>>> q = session.query(User)
>>> j = q.join_to('addresses')
>>> print j
users.user_id=addresses.user_id

join_to can also be given the name of a column-based property, in which case it will locate a path to the nearest mapper which has that property as a column:

>>> q = session.query(User)
>>> j = q.join_to('street')
>>> print j
users.user_id=addresses.user_id

Also available is the join_via function, which is similar to join_to, except that instead of traversing through all properties to find a path to the given key, it's given an explicit path to the target property:

>>> q = session.query(User)
>>> j = q.join_via(['orders', 'items'])
>>> print j
users.c.user_id==orders.c.user_id AND orders.c.item_id==items.c.item_id

Expressions produced by join_to and join_via can be used with select to create more complicated query criterion across multiple relations:

>>> l = q.select(
    (addresses_table.c.street=='some address') &
    (items_table.c.item_name=='item #4') &
    q.join_to('addresses') &
    q.join_via(['orders', 'items'])
    )
back to section top
Eager Loading

With just a single parameter lazy=False specified to the relation object, the parent and child SQL queries can be joined together.

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address, lazy=False)
    }
  )

users = session.query(User).select(User.c.user_name=='Jane')
for u in users:
    print repr(u)
    for a in u.addresses:
        print repr(a)

Above, a pretty ambitious query is generated just by specifying that the User should be loaded with its child Addresses in one query. When the mapper processes the results, it uses an Identity Map to keep track of objects that were already loaded, based on their primary key identity. Through this method, the redundant rows produced by the join are organized into the distinct object instances they represent.
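The identity map can be observed directly: loading the same primary key twice yields the same instance (and recall from the Query synopsis that get() may bypass the SQL round trip entirely):

u1 = session.query(User).get(1)
u2 = session.query(User).get(1)
assert u1 is u2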

The generation of this query is also immune to the effects of additional joins being specified in the original query. To use our select_by example above, joining against the "addresses" table to locate users with a certain street results in this behavior:

users = session.query(User).select_by(street='123 Green Street')

The join implied by passing the "street" parameter is created as an additional join between the addresses and users tables. Also, since the eager join is aliased, no name conflict occurs.

back to section top
Using Options to Change the Loading Strategy

The options method on the Query object provides an easy way to get alternate forms of a mapper query from an original one. The most common use of this feature is to change the "eager/lazy" loading behavior of a particular mapper, via the functions eagerload(), lazyload() and noload():

# user mapper with lazy addresses
mapper(User, users_table, properties = {
             'addresses' : relation(mapper(Address, addresses_table))
         }
)

# query object
query = session.query(User)

# make an eager loading query
eagerquery = query.options(eagerload('addresses'))
u = eagerquery.select()

# make another query that won't load the addresses at all
plainquery = query.options(noload('addresses'))

# multiple options can be specified
myquery = oldquery.options(lazyload('tracker'), noload('streets'), eagerload('members'))

# to specify a relation on a relation, separate the property names by a "."
myquery = oldquery.options(eagerload('orders.items'))
back to section top
One to One/Many to One

The above examples focused on the "one-to-many" relationship. Other forms of relationship are just as easy, as the relation function can usually figure out what you want:

metadata = MetaData()

# a table to store a user's preferences for a site
prefs_table = Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key = True),
    Column('stylename', String(20)),
    Column('save_password', Boolean, nullable = False),
    Column('timezone', CHAR(3), nullable = False)
)

# user table with a 'preference_id' column
users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('password', String(20), nullable = False),
    Column('preference_id', Integer, ForeignKey("user_prefs.pref_id"))
)

# engine and some test data
engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)
engine.execute(prefs_table.insert(), dict(pref_id=1, stylename='green', save_password=1, timezone='EST'))
engine.execute(users_table.insert(), dict(user_name = 'fred', password='45nfss', preference_id=1))

# classes
class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class UserPrefs(object):
    pass

mapper(UserPrefs, prefs_table)

mapper(User, users_table, properties = dict(
    preferences = relation(UserPrefs, lazy=False, cascade="all, delete-orphan"),
))

# select
session = create_session(bind_to=engine)
user = session.query(User).get_by(user_name='fred')
save_password = user.preferences.save_password

# modify
user.preferences.stylename = 'bluesteel'

# flush
session.flush()
back to section top
Many to Many

The relation function handles a basic many-to-many relationship when you specify the association table:

metadata = MetaData()

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id"))
)

engine = create_engine('sqlite:///')
metadata.create_all(engine)

# class definitions
class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name

class Article(object):
    pass

mapper(Keyword, keywords_table)

# define a mapper that does many-to-many on the 'itemkeywords' association 
# table
mapper(Article, articles_table, properties = dict(
    keywords = relation(Keyword, secondary=itemkeywords_table, lazy=False)
    )
)

session = create_session(bind_to=engine)

article = Article()
article.headline = 'a headline'
article.body = 'this is the body'
article.keywords.append(Keyword('politics'))
article.keywords.append(Keyword('entertainment'))
session.save(article)

session.flush()
# select articles based on a keyword.  select_by will handle the extra joins.
articles = session.query(Article).select_by(keyword_name='politics')
a = articles[0]

# clear out keywords with a new list
a.keywords = []
a.keywords.append(Keyword('topstories'))
a.keywords.append(Keyword('government'))

# flush
session.flush()
back to section top
Association Object

Many to Many can also be done with an association object, which adds additional information about how two items are related. The association object is set up in basically the same way as any other mapped object. However, since an association table typically has no primary key columns, you have to tell the mapper which columns compose its "primary key": the two (or more) columns involved in the association. Also, the relation function needs an additional hint that this mapped object is an association object, via the "association" argument, which points to the class or mapper representing the other side of the association.

from sqlalchemy import *
metadata = MetaData()

users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
)

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

# add "attached_by" column which will reference the user who attached this keyword
itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")),
    Column('attached_by', Integer, ForeignKey("users.user_id"))
)

engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)

# class definitions
class User(object):
    pass
class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name
class Article(object):
    pass
class KeywordAssociation(object):
    pass

mapper(User, users_table)
mapper(Keyword, keywords_table)

# mapper for KeywordAssociation
# specify "primary key" columns manually
mapper(KeywordAssociation, itemkeywords_table,
    primary_key = [itemkeywords_table.c.article_id, itemkeywords_table.c.keyword_id],
    properties={
        'keyword' : relation(Keyword, lazy = False), 
        'user' : relation(User, lazy = False) 
    }
)

# Article mapper, relates to Keyword via KeywordAssociation
mapper(Article, articles_table, properties={
    'keywords':relation(KeywordAssociation, lazy=False, association=Keyword)
    }
)

session = create_session(bind_to=engine)
# select by keyword
alist = session.query(Article).select_by(keyword_name='jacks_stories')
# user is available
for a in alist:
    for k in a.keywords:
        if k.keyword.keyword_name == 'jacks_stories':
            print k.user.user_name

Keep in mind that the association object works a little differently from a plain many-to-many relationship. Members have to be added to the list via instances of the association object, which in turn point to the associated object:

user = User()
user.user_name = 'some user'

article = Article()

assoc = KeywordAssociation()
assoc.keyword = Keyword('blue')
assoc.user = user

assoc2 = KeywordAssociation()
assoc2.keyword = Keyword('green')
assoc2.user = user

article.keywords.append(assoc)
article.keywords.append(assoc2)

session.flush()
back to section top

Overview

The concept behind Unit of Work is to track modifications to a field of objects, and then be able to flush those changes to the database in a single operation. There are a lot of advantages to this, including that your application doesn't need to worry about individual save operations on objects, nor about the required order for those operations, nor about excessive repeated calls to save operations that would be more efficiently aggregated into one step. It also simplifies database transactions, providing a neat package with which to insert into the traditional database begin/commit phase.

SQLAlchemy's unit of work includes these functions:

  • The ability to monitor scalar and list attributes on object instances, as well as object creation. This is handled via the attributes package.
  • The ability to maintain and process a list of modified objects, and based on the relationships set up by the mappers for those objects as well as the foreign key relationships of the underlying tables, figure out the proper order of operations so that referential integrity is maintained, and also so that on-the-fly values such as newly created primary keys can be propagated to dependent objects that need them before they are saved. The central algorithm for this is the topological sort.
  • The ability to define custom functionality that occurs within the unit-of-work flush phase, such as "before insert", "after insert", etc. This is accomplished via MapperExtension.
  • an Identity Map, which is a dictionary storing the one and only instance of an object for a particular table/primary key combination. This allows many parts of an application to get a handle to a particular object without any chance of modifications going to two different places.
  • The sole interface to the unit of work is provided via the Session object. Transactional capability, which rides on top of the transactions provided by Engine objects, is provided by the SessionTransaction object.
  • Thread-locally scoped Session behavior is available as an option, which allows new objects to be automatically added to the Session corresponding to the default Session context. Without a default Session context, an application must explicitly create a Session and add new objects to it. The default Session context, disabled by default, can also be swapped for other user-defined schemes, which may also take into account the specific class being dealt with for a particular operation.
  • The Session object in SQLAlchemy 0.2 borrows conceptually from that of Hibernate, a leading ORM for Java that is largely based on JSR-220. SQLAlchemy, under no obligation to conform to EJB specifications, is in general very different from Hibernate, providing a different paradigm for producing queries, a SQL API that is usable independently of the ORM, and of course Pythonic configuration as opposed to XML; however, JSR-220/Hibernate makes some pretty good suggestions with regards to the mechanisms of persistence.
back to section top
Object States

When dealing with mapped instances with regards to Sessions, an instance may be attached or unattached to a particular Session. An instance also may or may not correspond to an actual row in the database. The product of these two binary conditions yields us four general states a particular instance can have within the perspective of the Session (a short sketch of the transitions follows the list below):

  • Transient - a transient instance exists within memory only and is not associated with any Session. It also has no database identity and does not have a corresponding record in the database. When a new instance of a class is constructed, and no default session context exists with which to automatically attach the new instance, it is a transient instance. The instance can then be saved to a particular session in which case it becomes a pending instance. If a default session context exists, new instances are added to that Session by default and therefore become pending instances immediately.

  • Pending - a pending instance is a Session-attached object that has not yet been assigned a database identity. When the Session is flushed (i.e. changes are persisted to the database), a pending instance becomes persistent.

  • Persistent - a persistent instance has a database identity and a corresponding record in the database, and is also associated with a particular Session. By "database identity" we mean the object is associated with a table or relational concept in the database combined with a particular primary key in that table. Objects that are loaded by SQLAlchemy in the context of a particular session are automatically considered persistent, as are formerly pending instances which have been subject to a session flush().

  • Detached - a detached instance is an instance which has a database identity and corresponding row in the database, but is not attached to any Session. This occurs when an instance has been removed from a Session, either because the session itself was cleared or closed, or because the instance was explicitly removed from the Session. The object can be re-attached to a session, in which case it becomes persistent again. Detached instances are useful when an application needs to represent a long-running operation across multiple Sessions, needs to store an object in a serialized state and then restore it later (such as within an HTTP "session" object), or in some cases where code needs to load instances locally which will later be associated with some other Session.
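
Below is a minimal sketch of an instance moving through all four states, using Session methods that are described later in this chapter:

u = User()            # transient: no Session, no database identity
session.save(u)       # pending: attached to a Session, not yet flushed
session.flush()       # persistent: attached, with a database identity
session.expunge(u)    # detached: database identity, but no Session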

back to section top
Acquiring a Session

A new Session object is constructed via the create_session() function:

session = create_session()

A common option used with create_session() is to specify a specific Engine or Connection to be used for all operations performed by this Session:

# create an engine
e = create_engine('postgres://some/url')

# create a Session that will use this engine for all operations.
# it will open and close Connections as needed.
session = create_session(bind_to=e)

# open a Connection
conn = e.connect()

# create a Session that will use this specific Connection for all operations
session = create_session(bind_to=conn)

The session to which an object is attached can be acquired via the object_session() function, which returns the appropriate Session if the object is pending or persistent, or None if the object is transient or detached:

session = object_session(obj)

It is possible to install a default "threadlocal" session context by importing a mod called sqlalchemy.mods.threadlocal. This mod creates a familiar SA 0.1 keyword objectstore in the sqlalchemy namespace. The objectstore may be used directly like a session; all session actions performed on sqlalchemy.objectstore will be proxied to the thread-local Session:

# install 'threadlocal' mod (only need to call this once per application)
import sqlalchemy.mods.threadlocal

# then 'objectstore' is available within the 'sqlalchemy' namespace
from sqlalchemy import objectstore

# flush the current thread-local session using the objectstore directly
objectstore.flush()

# which is the same as this (assuming we are still on the same thread):
session = objectstore.get_session()
session.flush()

We will now cover some of the key concepts used by Sessions and their underlying Unit of Work.

back to section top
Introduction to the Identity Map

A primary concept of the Session's underlying Unit of Work is that it keeps track of all persistent instances; recall that a persistent instance has a database identity and is attached to a Session. In particular, the Unit of Work must ensure that only one copy of a particular persistent instance exists within the Session at any given time. The UOW accomplishes this task using a dictionary known as an Identity Map. When a Query is used to issue select or get requests to the database, it will in nearly all cases result in an actual SQL execution to the database, and a corresponding traversal of rows received from that execution. However, when the underlying mapper instantiates objects corresponding to the result set rows it receives, it will check the session's identity map first before instantiating a new object, and return the same instance already present in the identity map if it already exists, essentially ignoring the object state represented by that row. There are several ways to override this behavior and truly refresh an already-loaded instance, described later, but the main idea is that once your instance is loaded into a particular Session, it will never change its state without your explicit approval, regardless of what the database says about it.

For example: below, two separate calls to load an instance with database identity "15" are issued, and the results assigned to two separate variables. However, since the same Session was used, the two variables reference the same instance:

mymapper = mapper(MyClass, mytable)

session = create_session()
obj1 = session.query(MyClass).selectfirst(mytable.c.id==15)
obj2 = session.query(MyClass).selectfirst(mytable.c.id==15)

>>> obj1 is obj2
True

The Identity Map is an instance of weakref.WeakValueDictionary, so that when an in-memory object falls out of scope, it will be removed automatically. However, this may not be instant if there are circular references upon the object. To guarantee that an instance is removed from the identity map before removing references to it, use the expunge() method, described later, to remove it.
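
A minimal sketch of that guideline:

# explicitly remove obj1 from the identity map first...
session.expunge(obj1)
# ...then drop the last application reference to it
del obj1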

The Session supports an iterator interface in order to see all objects in the identity map:

for obj in session:
    print obj

As well as __contains__():

if obj in session:
    print "Object is present"

The identity map itself is accessible via the identity_map accessor:

>>> session.identity_map.values()
[<__main__.User object at 0x712630>, <__main__.Address object at 0x712a70>]

The identity of each object instance is available via the _instance_key property attached to each object instance, and is a tuple consisting of the object's class and an additional tuple of primary key values, in the order that they appear within the table definition:

>>> obj._instance_key 
(<class 'test.tables.User'>, (7,))

At the moment that an object is assigned this key within a flush() operation, it is also added to the session's identity map.

The get() method on Query, which retrieves an object based on primary key identity, also checks in the Session's identity map first to save a database round-trip if possible. In the case of an object lazy-loading a single child object, the get() method is used as well, so scalar-based lazy loads may in some cases not query the database; this is particularly important for backreference relationships as it can save a lot of queries.
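
As a brief sketch of this behavior (assuming a User row with primary key 7 exists), the second get() below should locate the instance in the identity map and issue no SQL:

u1 = session.query(User).get(7)    # issues SQL if not already loaded
u2 = session.query(User).get(7)    # found in the identity map; no round-trip
assert u1 is u2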

back to section top
What's Changed?

The next concept is that in addition to the Session storing a record of all objects loaded or saved, it also stores lists of all newly created (i.e. pending) objects, lists of all persistent objects whose attributes have been modified, and lists of all persistent objects that have been marked as deleted. These lists are used when a flush() call is issued to save all changes. After the flush occurs, these lists are all cleared out.

These records are all tracked by a collection of Set objects (a SQLAlchemy-specific implementation called HashSet) that are also viewable off the Session:

# pending objects recently added to the Session
session.new

# persistent objects with modifications
session.dirty

# persistent objects that have been marked as deleted via session.delete(obj)
session.deleted

Unlike the identity map, the new, dirty, and deleted lists are not weak referencing. This means if you abandon all references to new or modified objects within a session, they are still present and will be saved on the next flush operation, unless they are removed from the Session explicitly (more on that later). The new list may change in a future release to be weak-referencing; however, for the deleted list, one can see that it's quite natural for an object marked as deleted to have no references in the application, yet a DELETE operation is still required.
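
A brief sketch of this non-weak-referencing behavior; even after the application abandons its reference, the modified instance remains in session.dirty and will still be saved:

user = session.query(User).get(7)
user.user_name = 'ed'
user = None          # abandon our reference to the modified instance

# the instance is still tracked in session.dirty; the UPDATE is issued here
session.flush()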

back to section top
The Session API
query()

The query() function takes a class or Mapper as an argument, along with an optional entity_name parameter, and returns a new Query object which will issue mapper queries within the context of this Session. If a Mapper is passed, then the Query uses that mapper. Otherwise, if a class is sent, it will locate the primary mapper for that class which is used to construct the Query.

# query from a class
session.query(User).select_by(name='ed')

# query from a mapper
query = session.query(usermapper)
x = query.get(1)

# query from a class mapped with entity name 'alt_users'
q = session.query(User, entity_name='alt_users')
y = q.options(eagerload('orders')).select()

entity_name is an optional keyword argument sent with a class object, in order to further qualify which primary mapper to be used; this only applies if there was a Mapper created with that particular class/entity name combination, else an exception is raised. All of the methods on Session which take a class or mapper argument also take the entity_name argument, so that a given class can be properly matched to the desired primary mapper.

All instances retrieved by the returned Query object will be stored as persistent instances within the originating Session.

back to section top
get()

Given a class or mapper, a scalar or tuple-based identity, and an optional entity_name keyword argument, creates a Query corresponding to the given mapper or class/entity_name combination, and calls the get() method with the given identity value. If the object already exists within this Session, it is simply returned, else it is queried from the database. If the instance is not found, the method returns None.

# get Employer primary key 5
employer = session.get(Employer, 5)

# get Report composite primary key 7,12, using mapper 'report_mapper_b'
report = session.get(Report, (7,12), entity_name='report_mapper_b')
back to section top
load()

load() is similar to get() except it will raise an exception if the instance does not exist in the database. It will also load the object's data from the database in all cases, and overwrite all changes on the object if it already exists in the session with the latest data from the database.

# load Employer primary key 5
employer = session.load(Employer, 5)

# load Report composite primary key 7,12, using mapper 'report_mapper_b'
report = session.load(Report, (7,12), entity_name='report_mapper_b')
back to section top
save()

save() is called with a single transient (unsaved, unattached) instance as an argument, which is then added to the Session and becomes pending. When the session is next flushed, the instance will be saved to the database, upon which it becomes persistent (saved, attached). If the given instance is not transient, meaning it is either attached to an existing Session or it has a database identity, an exception is raised.

user1 = User(name='user1')
user2 = User(name='user2')
session.save(user1)
session.save(user2)

session.flush()     # write changes to the database

save() is called automatically for new instances by the class's associated mapper, if a default Session context is in effect (such as a thread-local session), which means that newly created instances automatically become pending. If there is no default session available, then the instance remains transient (unattached) until it is explicitly added to a Session via the save() method.

A transient instance can also be automatically saved if it is associated with a parent object which specifies save-update within its cascade rules, and that parent is already attached or becomes attached to a Session. For more information on cascade, see the next section.

The save_or_update() method, covered later, is a convenience method which will call the save() or update() method appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached).

back to section top
flush()

This is the main gateway to what the Unit of Work does best, which is to save everything! It should be clear by now what a flush looks like:

session.flush()

It also can be called with a list of objects; in this form, the flush operation will be limited only to the objects specified in the list, as well as any child objects within private relationships for a delete operation:

# saves only user1 and address2.  all other modified
# objects remain present in the session.
session.flush([user1, address2])

This second form of flush should be used carefully as it will not necessarily locate other dependent objects within the session, whose database representation may have foreign constraint relationships with the objects being operated upon.

Notes on Flush

A common misconception about the flush() operation is that once performed, the newly persisted instances will automatically have related objects attached to them, based on the values of primary key identities that have been assigned to the instances before they were persisted. An example would be, you create a new Address object, set address.user_id to 5, and then flush() the session. The erroneous assumption would be that there is now a User object of identity "5" attached to the Address object, but in fact this is not the case. If you were to refresh() the Address, invalidating its current state and re-loading, then it would have the appropriate User object present.
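
A minimal sketch of the misconception, assuming a hypothetical 'user' relation on the Address mapper (e.g. created via a backref):

address = Address()
address.user_id = 5        # set the foreign key value directly
session.save(address)
session.flush()
# the 'user' relation was NOT populated by the flush
assert address.user is None
# invalidate and re-load the instance; now the User is present
session.refresh(address)
assert address.user is not None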

This misunderstanding is related to the observed behavior of backreferences (Backreferences), which automatically associates an instance "A" with another instance "B", in response to the manual association of instance "B" to instance "A" by the user. The backreference operation occurs completely externally to the flush() operation, and is pretty much the only example of a SQLAlchemy feature that manipulates the relationships of persistent objects.

The primary guideline for dealing with flush() is: the developer is responsible for maintaining in-memory objects and their relationships to each other; the unit of work is responsible for maintaining the database representation of the in-memory objects. The typical pattern is that the manipulation of objects is the way that changes get communicated to the unit of work, so that when the flush occurs, the objects are already in their correct in-memory representation and problems don't arise. The manipulation of identifier attributes like integer key values, as well as deletes in particular, are a frequent source of confusion.

back to section top
close()

This method first calls clear(), removing all objects from this Session, and then insures that any transactional resources are closed.

back to section top
delete()

The delete method places an instance into the Unit of Work's list of objects to be marked as deleted:

# mark two objects to be deleted
session.delete(obj1)
session.delete(obj2)

# flush
session.flush()

The delete operation will have an effect on instances that are attached to the deleted instance according to the cascade style of the relationship; cascade rules are described further in the following section. By default, associated instances may need to be updated in the database to reflect that they no longer are associated with the parent object, before the parent is deleted. If the relationship specifies cascade="delete", then the associated instance will also be deleted upon flush, assuming it is still attached to the parent. If the relationship additionally includes the delete-orphan cascade style, the associated instance will be deleted if it is still attached to the parent, or is unattached to any other parent.

The delete() operation has no relationship to the in-memory status of the instance, including usage of the del Python statement. An instance marked as deleted and flushed will still exist within memory until references to it are freed; similarly, removing an instance from memory via the del statement will have no effect, since the persistent instance will still be referenced by its Session. Obviously, if the instance is removed from the Session and then totally dereferenced, it will no longer exist in memory, but also won't exist in any Session and is therefore not deleted from the database.

back to section top
clear()

This method detaches all instances from the Session, sending them to the detached or transient state as applicable, and replaces the underlying UnitOfWork with a new one.

session.clear()

The clear() method is particularly useful with a "default context" session such as a thread-local session, which can stay attached to the current thread and handle a new field of objects without a brand new Session having to be created.

back to section top
refresh() / expire()

To assist with the Unit of Work's "sticky" behavior, individual objects can have all of their attributes immediately re-loaded from the database, or marked as "expired" which will cause a re-load to occur upon the next access of any of the object's mapped attributes. This includes all relationships, so lazy-loaders will be re-initialized, eager relationships will be repopulated. Any changes marked on the object are discarded:

# immediately re-load attributes on obj1, obj2
session.refresh(obj1)
session.refresh(obj2)

# expire objects obj1, obj2, attributes will be reloaded
# on the next access:
session.expire(obj1)
session.expire(obj2)
back to section top
expunge()

Expunge removes an object from the Session, sending persistent instances to the detached state, and pending instances to the transient state:

session.expunge(obj1)

Use expunge when you'd like to remove an object altogether from memory, such as before calling del on it, which will prevent any "ghost" operations from occurring when the session is flushed.

back to section top
bind_mapper() / bind_table()

Both of these methods receive two arguments; in the case of bind_mapper(), it is a Mapper and an Engine or Connection instance; in the case of bind_table(), it is a Table instance or other Selectable (such as an Alias, Select, etc.), and an Engine or Connection instance.

engine1 = create_engine('sqlite:///file1.db')
engine2 = create_engine('mysql://localhost')

sqlite_connection = engine1.connect()

sess = create_session()

sess.bind_mapper(mymapper, sqlite_connection)  # bind mymapper operations to a single SQLite connection
sess.bind_table(email_addresses_table, engine2) # bind operations with the email_addresses_table to mysql

Normally, when a Session is created via create_session() with no arguments, the Session has no awareness of individual Engines, and when mappers use the Session to retrieve connections, the underlying MetaData each Table is associated with is expected to be "bound" to an Engine, else no engine can be located and an exception is raised. A second form of create_session() takes the argument bind_to=engine_or_connection, where all SQL operations performed by this Session use the single Engine or Connection (collectively known as a Connectable) passed to the constructor. With bind_mapper() and bind_table(), the operations of individual mapper and/or tables are bound to distinct engines or connections, thereby overriding not only the engine which may be "bound" to the underlying MetaData, but also the Engine or Connection which may have been passed to the create_session() function. Configurations which interact with multiple explicit database connections at one time must use either or both of these methods in order to associate Session operations with the appropriate connection resource.

Binding a Mapper to a resource takes precedence over a Table bind, meaning if mapper A is associated with table B, and the Session binds mapper A to connection X and table B to connection Y, an operation with mapper A will use connection X, not connection Y.
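
A minimal sketch of that precedence rule, assuming mymapper is mapped against email_addresses_table:

sess.bind_mapper(mymapper, engine1)              # mapper-level bind
sess.bind_table(email_addresses_table, engine2)  # table-level bind

# operations with mymapper will use engine1, not engine2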

back to section top
update()

The update() method is used only with detached instances. A detached instance only exists if its Session was cleared or closed, or the instance was expunge()d from its session. update() will re-attach the detached instance to this Session, bringing it back to the persistent state and allowing any changes on the instance to be saved when the Session is next flushed. If the instance is already attached to an existing Session, an exception is raised.

A detached instance can also be automatically updated if it is associated with a parent object which specifies save-update within its cascade rules, and that parent is already attached or becomes attached to a Session. For more information on cascade, see the next section.

The save_or_update() method is a convenience method which will call the save() or update() method appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached).

back to section top
save_or_update()

This method is a combination of the save() and update() methods, which will examine the given instance for a database identity (i.e. if it is transient or detached), and will call the implementation of save() or update() as appropriate. Use save_or_update() to add unattached instances to a session when you're not sure if they were newly created or not. Like save() and update(), save_or_update() cascades along the save-update cascade indicator, described in the cascade section below.
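
A brief sketch, assuming newuser is a freshly constructed transient instance and olduser is a detached instance loaded in some earlier Session:

session.save_or_update(newuser)   # no database identity: save(), becomes pending
session.save_or_update(olduser)   # database identity: update(), becomes persistent
session.flush()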

back to section top
merge()

Feature Status: Alpha Implementation

merge() is used to return the persistent version of an instance that is not attached to this Session. When passed an instance, if an instance with its database identity already exists within this Session, it is returned. If the instance does not exist in this Session, it is loaded from the database and then returned.

A future version of merge() will also update the Session's instance with the state of the given instance (hence the name "merge").

This method is useful for bringing in objects which may have been restored from a serialization, such as those stored in an HTTP session:

# deserialize an object
myobj = pickle.loads(mystring)

# "merge" it.  if the session already had this object in the 
# identity map, then you get back the one from the current session.
myobj = session.merge(myobj)

Note that merge() does not associate the given instance with the Session; it remains detached (or attached to whatever Session it was already attached to).

back to section top
Cascade rules

Feature Status: Alpha Implementation

Mappers support the concept of configurable cascade behavior on relation()s. This behavior controls how the Session should treat the instances that have a parent-child relationship with another instance that is operated upon by the Session. Cascade is indicated as a comma-separated list of string keywords, with the possible values all, delete, save-update, refresh-expire, merge, expunge, and delete-orphan.

Cascading is configured by setting the cascade keyword argument on a relation():

mapper(Order, order_table, properties={
    'items' : relation(Item, items_table, cascade="all, delete-orphan"),
    'customer' : relation(User, users_table, user_orders_table, cascade="save-update"),
})

The above mapper specifies two relations, items and customer. The items relationship specifies "all, delete-orphan" as its cascade value, indicating that all save, update, merge, expunge, refresh, delete and expire operations performed on a parent Order instance should also be performed on the child Item instances attached to it (save and update are cascaded using the save_or_update() method, so that the database identity of the instance doesn't matter). The delete-orphan cascade value additionally indicates that if an Item instance is no longer associated with an Order, it should also be deleted. The "all, delete-orphan" cascade argument allows a so-called lifecycle relationship between an Order and an Item object.

The customer relationship specifies only the "save-update" cascade value, indicating that most operations will not be cascaded from a parent Order instance to a child User instance, except when the Order is attached to a particular session, via the save(), update(), or save_or_update() method.

Additionally, when a child item is attached to a parent item that specifies the "save-update" cascade value on the relationship, the child is automatically passed to save_or_update() (and the operation is further cascaded to the child item).

Note that cascading doesn't do anything that isn't possible by manually calling Session methods on individual instances within a hierarchy; it merely automates common operations on a group of associated instances.

The default value for cascade on relation()s is save-update, and the private=True keyword argument is a synonym for cascade="all, delete-orphan".
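
In other words, using the Order/Item mapping above as a reference, these two relation() configurations should be equivalent:

relation(Item, items_table, cascade="all, delete-orphan")
relation(Item, items_table, private=True)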

back to section top
SessionTransaction

SessionTransaction is a multi-engine transaction manager, which aggregates one or more Engine/Connection pairs and keeps track of a Transaction object for each one. As the Session receives requests to execute SQL statements, it uses the Connection that is referenced by the SessionTransaction. At commit time, the underlying Session is flushed, and each Transaction is then committed.

Example usage is as follows:

sess = create_session()
trans = sess.create_transaction()
try:
    item1 = sess.query(Item).get(1)
    item2 = sess.query(Item).get(2)
    item1.foo = 'bar'
    item2.bar = 'foo'
    trans.commit()
except:
    trans.rollback()
    raise

The create_transaction() method creates a new SessionTransaction object but does not declare any connection/transaction resources. At the point of the first get() call, a connection resource is opened off the engine that corresponds to the Item class's mapper and is stored within the SessionTransaction with an open Transaction. When trans.commit() is called, the flush() method is called on the Session and the corresponding update statements are issued to the database within the scope of the transaction already opened; afterwards, the underlying Transaction is committed, and connection resources are freed.

SessionTransaction, like the Transaction off of Connection, also supports "nested" behavior, and is safe to pass to other functions which then issue their own begin()/commit() pair; only the outermost begin()/commit() pair actually affects the transaction, and any call to rollback() within a particular call stack will issue a rollback.

Note that while SessionTransaction is capable of tracking multiple transactions across multiple databases, it is currently in no way a fully functioning two-phase commit engine; generally, when dealing with multiple databases simultaneously, there is the distinct possibility that a transaction can succeed on the first database and fail on the second, which for some applications may be an invalid state. If this is an issue, it's best either to refrain from spanning transactions across databases, or to look into some of the available technologies in this area, such as Zope, which offers a two-phase commit engine; some users have already created their own SQLAlchemy/Zope hybrid implementations to deal with scenarios like these.

Using SQL with SessionTransaction

The SessionTransaction can interact with direct SQL queries in two general ways: either specific Connection objects can be associated with the SessionTransaction, which are then usable both for direct SQL as well as within flush() operations performed by the SessionTransaction, or the Connection object automatically referenced within the SessionTransaction can be accessed directly.

To associate a specific Connection with the SessionTransaction, use the add() method:

Associate a Connection with the SessionTransaction
connection = engine.connect()
trans = session.create_transaction()
try:
    trans.add(connection)
    connection.execute(mytable.update(), {'col1':4, 'col2':17})
    session.flush() # flush() operation will use the same connection
    trans.commit()  
except:
    trans.rollback()
    raise

The add() method will key the Connection's underlying Engine to this SessionTransaction. When mapper operations are performed against this Engine, the Connection explicitly added will be used. This overrides any other Connection objects that the underlying Session was associated with, corresponding to the underlying Engine of that Connection. However, if the SessionTransaction itself is already associated with a Connection, then an exception is thrown.

The other way is just to use the Connection referenced by the SessionTransaction. This is performed via the connection() method, and requires passing in a class or Mapper which indicates which underlying Connection should be returned (recall that different Mappers may use different underlying Engines). If the class_or_mapper argument is None, then the Session must be globally bound to a specific Engine when it was constructed, else the method returns None.

Get a Connection from the SessionTransaction
trans = session.create_transaction()
try:
    connection = trans.connection(UserClass)   # get the Connection used by the UserClass' Mapper
    connection.execute(mytable.update(), {'col1':4, 'col2':17})
    trans.commit()
except:
    trans.rollback()
    raise

The connection() method also exists on the Session object itself, and can be called regardless of whether or not a SessionTransaction is in progress. If a SessionTransaction is in progress, it will return the connection referenced by the transaction. If an Engine is being used with threadlocal strategy, the Connection returned will correspond to the connection resources that are bound to the current thread, if any (i.e. it is obtained by calling contextual_connection()).
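
A brief sketch of the Session-level method, reusing the UserClass and mytable names from the examples above:

# get the Connection that would be used for UserClass' mapper; if a
# SessionTransaction is in progress, this is the transaction's connection
connection = session.connection(UserClass)
connection.execute(mytable.update(), {'col1':4, 'col2':17})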

back to section top
Using Engine-level Transactions with Sessions

The transactions issued by SessionTransaction, as well as internally by the Session's flush() operation, use the same Transaction object off of Connection that is publicly available. Recall that this object supports "nestable" behavior, meaning any number of actors can call begin() off a particular Connection object, and they will all be managed within the scope of a single transaction. Therefore, the flush() operation can similarly take place within the scope of a regular Transaction:

Transactions with Sessions
connection = engine.connect()   # Connection
session = create_session(bind_to=connection) # Session bound to the Connection
trans = connection.begin()      # start transaction
try:
    stuff = session.query(MyClass).select()     # Session operation uses connection
    stuff[2].foo = 'bar'
    connection.execute(mytable.insert(), dict(id=12, value="bar"))    # use connection explicitly
    session.flush()     # Session flushes with "connection", using transaction "trans"
    trans.commit()      # commit
except:
    trans.rollback()    # or rollback
    raise
back to section top
Analyzing Object Flushes

The session module can log an extensive display of its "flush plans", which is a graph of its internal representation of objects before they are written to the database. To turn this logging on:

# make a Session with echo_uow
session = create_session(echo_uow=True)

The flush() operation will then dump displays like the following to standard output:

Task dump:

 UOWTask(6034768, 'User/users/None')
  |
  |- Save User(6016624)
  |       |-Process User(6016624).addresses
  |
  |- UOWTask(6034832, 'Address/email_addresses/None')
  |   |- Save Address(6034384)
  |   |- Save Address(6034256)
  |   |----
  | 
  |----

The above graph can be read straight downwards to determine the order of operations. It indicates "save User 6016624, process each element in the 'addresses' list on User 6016624, save Address 6034384, Address 6034256".

Of course, one can also get a good idea of the order of operations just by logging the actual SQL statements executed.

back to section top

This section details all the options available to Mappers, as well as advanced patterns.

To start, here are the tables we will work with again:

from sqlalchemy import *

metadata = MetaData()

# a table to store users
users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(40)),
    Column('password', String(80))
)

# a table that stores mailing addresses associated with a specific user
addresses_table = Table('addresses', metadata,
    Column('address_id', Integer, primary_key = True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

# a table that stores keywords
keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('name', VARCHAR(50))
)

# a table that associates keywords with users
userkeywords_table = Table('userkeywords', metadata,
    Column('user_id', INT, ForeignKey("users")),
    Column('keyword_id', INT, ForeignKey("keywords"))
)
More On Mapper Properties
Overriding Column Names

When mappers are constructed, by default the column names in the Table metadata are used as the names of attributes on the mapped class. This can be customized within the properties by stating the key/column combinations explicitly:

user_mapper = mapper(User, users_table, properties={
    'id' : users_table.c.user_id,
    'name' : users_table.c.user_name,
})

In the situation when column names overlap in a mapper against multiple tables, columns may be referenced together with a list:

# join users and addresses
usersaddresses = sql.join(users_table, addresses_table, users_table.c.user_id == addresses_table.c.user_id)
m = mapper(User, usersaddresses,   
    properties = {
        'id' : [users_table.c.user_id, addresses_table.c.user_id],
    }
    )
back to section top
Overriding Properties

A common request is the ability to create custom class properties that override the behavior of setting/getting an attribute. Currently, the easiest way to do this in SQLAlchemy is how it would be done in any Python program; define your attribute with a different name, such as "_attribute", and use a property to get/set its value. The mapper just needs to be told of the special name:

class MyClass(object):
    def _set_email(self, email):
       self._email = email
    def _get_email(self):
       return self._email
    email = property(_get_email, _set_email)

mapper(MyClass, mytable, properties = {
   # map the '_email' attribute to the "email" column
   # on the table
   '_email': mytable.c.email
})

In a later release, SQLAlchemy will also allow _get_email and _set_email to be attached directly to the "email" property created by the mapper, and will also allow this association to occur via decorators.

back to section top
Custom List Classes

Feature Status: Alpha API

A one-to-many or many-to-many relationship results in a list-holding element being attached to all instances of a class. Currently, this list is an instance of sqlalchemy.util.HistoryArraySet, which decorates an underlying list object. The implementation of this list can be controlled, and can in fact be any object that implements list-style append and __iter__ methods. A common need is for a list-based relationship to actually be a dictionary. This can be achieved by subclassing dict to have list-like behavior.

In this example, a class MyClass is defined, which is associated with a parent object MyParent. The collection of MyClass objects on each MyParent object will be a dictionary, storing each MyClass instance keyed to its name attribute.

# a class to be stored in the list
class MyClass(object):
    def __init__(self, name):
        self.name = name

# create a dictionary that will act like a list, and store
# instances of MyClass
class MyDict(dict):
    def append(self, item):
        self[item.name] = item
    def __iter__(self):
        return iter(self.values())

# parent class
class MyParent(object):
    # this class-level attribute provides the class to be
    # used by the 'myclasses' attribute
    myclasses = MyDict

# mappers, constructed normally
mapper(MyClass, myclass_table)
mapper(MyParent, myparent_table, properties={
    'myclasses' : relation(MyClass)
})

# elements on 'myclasses' can be accessed via string keyname
myparent = MyParent()
myparent.myclasses.append(MyClass('this is myclass'))
myclass = myparent.myclasses['this is myclass']
back to section top
Custom Join Conditions

When creating relations on a mapper, most examples so far have illustrated the mapper and relationship joining up based on the foreign keys of the tables they represent. In fact, this "automatic" inspection can be completely circumvented using the primaryjoin and secondaryjoin arguments to relation, as in this example which creates a User object which has a relationship to all of its Addresses which are in Boston:

class User(object):
    pass
class Address(object):
    pass

mapper(Address, addresses_table)
mapper(User, users_table, properties={
    'boston_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==Address.c.user_id, 
                Address.c.city=='Boston'))
})

Many to many relationships can be customized by one or both of primaryjoin and secondaryjoin, shown below with just the default many-to-many relationship explicitly set:

class User(object):
    pass
class Keyword(object):
    pass
mapper(Keyword, keywords_table)
mapper(User, users_table, properties={
    'keywords':relation(Keyword, secondary=userkeywords_table,
        primaryjoin=users_table.c.user_id==userkeywords_table.c.user_id,
        secondaryjoin=userkeywords_table.c.keyword_id==keywords_table.c.keyword_id
        )
})
back to section top
Lazy/Eager Joins Multiple Times to One Table

The previous example leads in to the idea of joining against the same table multiple times. Below is a User object that has lists of its Boston and New York addresses:

mapper(User, users_table, properties={
    'boston_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==Address.c.user_id, 
                Address.c.city=='Boston')),
    'newyork_addresses' : relation(Address, primaryjoin=
                and_(users_table.c.user_id==Address.c.user_id, 
                Address.c.city=='New York')),
})

Both lazy and eager loading support multiple joins equally well.

back to section top
Deferred Column Loading

This feature allows particular columns of a table to not be loaded by default, instead being loaded later on when first referenced. It is essentially "column-level lazy loading". This feature is useful when one wants to avoid loading a large text or binary field into memory when it's not needed. Individual columns can be lazy loaded by themselves or placed into groups that lazy-load together.

book_excerpts = Table('books', metadata, 
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo', Binary)
)

class Book(object):
    pass

# define a mapper that will load each of 'excerpt' and 'photo' in 
# separate, individual-row SELECT statements when each attribute
# is first referenced on the individual object instance
mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo' : deferred(book_excerpts.c.photo)
})

Deferred columns can be placed into groups so that they load together:

book_excerpts = Table('books', metadata, 
    Column('book_id', Integer, primary_key=True),
    Column('title', String(200), nullable=False),
    Column('summary', String(2000)),
    Column('excerpt', String),
    Column('photo1', Binary),
    Column('photo2', Binary),
    Column('photo3', Binary)
)

class Book(object):
    pass

# define a mapper with a 'photos' deferred group.  when one photo is referenced,
# all three photos will be loaded in one SELECT statement.  The 'excerpt' will 
# be loaded separately when it is first referenced.
mapper(Book, book_excerpts, properties = {
    'excerpt' : deferred(book_excerpts.c.excerpt),
    'photo1' : deferred(book_excerpts.c.photo1, group='photos'),
    'photo2' : deferred(book_excerpts.c.photo2, group='photos'),
    'photo3' : deferred(book_excerpts.c.photo3, group='photos')
})
back to section top
Relation Options

Keyword options to the relation function include the following (a combined sketch follows the list):

  • lazy=(True|False|None) - specifies how the related items should be loaded. A value of True indicates they should be loaded when the property is first accessed. A value of False indicates they should be loaded by joining against the parent object query, so parent and child are loaded in one round trip. A value of None indicates the related items are not loaded by the mapper in any case; the application will manually insert items into the list in some other way. A relationship with lazy=None is still important; items added to the list or removed will cause the appropriate updates and deletes upon flush(). Future capabilities for lazy might also include "lazy='extra'", which would allow lazy loading of child elements one at a time, for very large collections.
  • cascade - a string list of cascade rules which determines how persistence operations should be "cascaded" from parent to child. For a description of cascade rules, see datamapping_relations_cycle and Cascade rules.
  • secondary - for a many-to-many relationship, specifies the intermediary table.
  • primaryjoin - a ClauseElement that will be used as the primary join of this child object against the parent object, or in a many-to-many relationship the join of the primary object to the association table. By default, this value is computed based on the foreign key relationships of the parent and child tables (or association table).
  • secondaryjoin - a ClauseElement that will be used as the join of an association table to the child object. By default, this value is computed based on the foreign key relationships of the association and child tables.
  • foreignkey - specifies which column in this relationship is "foreign", i.e. which column refers to the parent object. This value is automatically determined in most cases based on the primary and secondary join conditions, except in the case of a self-referential mapper, where it is needed to indicate the child object's reference back to its parent, or in the case where the join conditions do not represent any primary key columns to properly represent the direction of the relationship.
  • uselist - a boolean that indicates if this property should be loaded as a list or a scalar. In most cases, this value is determined based on the type and direction of the relationship - one to many forms a list, many to one forms a scalar, many to many is a list. If a scalar is desired where normally a list would be present, such as a bi-directional one-to-one relationship, set uselist to False.
  • private - setting private=True is the equivalent of setting cascade="all, delete-orphan", and indicates the lifecycle of child objects should be contained within that of the parent. See the example in datamapping_relations_cycle.
  • backref - indicates the name of a property to be placed on the related mapper's class that will handle this relationship in the other direction, including synchronizing the object attributes on both sides of the relation. Can also point to a backref() construct for more configurability. See Backreferences.
  • order_by - indicates the ordering that should be applied when loading these items. See the section Controlling Ordering for details.
  • association - When specifying a many to many relationship with an association object, this keyword should reference the mapper or class of the target object of the association. See the example in Association Object.
  • postupdate - this indicates that the relationship should be handled by a second UPDATE statement after an INSERT, or before a DELETE. Using this flag essentially means the relationship will not incur any "dependency" between parent and child item, as the particular foreign key relationship between them is handled by a second statement. Use this flag when a particular mapping arrangement will incur two rows that are dependent on each other, such as a table that has a one-to-many relationship to a set of child rows, and also has a column that references a single child row within that list (i.e. both tables contain a foreign key to each other). If a flush() operation returns an error that a "cyclical dependency" was detected, this is a cue that you might want to use postupdate.
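
Here is the combined sketch referenced above, pulling several of these options together against the users/addresses tables from the start of this section (the backref name 'user' is illustrative):

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
    'addresses' : relation(Address, lazy=True, cascade="all, delete-orphan",
                            backref='user', order_by=addresses_table.c.address_id)
})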
back to section top
Controlling Ordering

By default, mappers will attempt to ORDER BY the "oid" column of a table, or the primary key column, when selecting rows. This can be modified in several ways.

The "order_by" parameter can be sent to a mapper, overriding the per-engine ordering if any. A value of None means that the mapper should not use any ordering. A non-None value, which can be a column, an asc or desc clause, or an array of either one, indicates the ORDER BY clause that should be added to all select queries:

# disable all ordering
mapper = mapper(User, users_table, order_by=None)

# order by a column
mapper = mapper(User, users_table, order_by=users_table.c.user_id)

# order by multiple items
mapper = mapper(User, users_table, order_by=[users_table.c.user_id, desc(users_table.c.user_name)])

"order_by" can also be specified to an individual select method, overriding all other per-engine/per-mapper orderings:

# order by a column
l = mapper.select(users_table.c.user_name=='fred', order_by=users_table.c.user_id)

# order by multiple criterion
l = mapper.select(users_table.c.user_name=='fred', order_by=[users_table.c.user_id, desc(users_table.c.user_name)])

For relations, the "order_by" property can also be specified to all forms of relation:

# order address objects by address id
mapper = mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table), order_by=addresses_table.c.address_id)
})

# eager load with ordering - the ORDER BY clauses of parent/child will be organized properly
mapper = mapper(User, users_table, properties = {
    'addresses' : relation(mapper(Address, addresses_table), order_by=desc(addresses_table.c.email_address), eager=True)
}, order_by=users_table.c.user_id)
back to section top
Limiting Rows

You can limit rows in a regular SQL query by specifying limit and offset. A Mapper can handle the same concepts:

class User(object):
    pass

mapper(User, users_table)
r = session.query(User).select(limit=20, offset=10)

However, things get tricky when dealing with eager relationships, since a straight LIMIT of rows does not represent the count of items when joining against other tables to load related items as well. So here is what SQLAlchemy will do when you use limit or offset with an eager relationship:

class User(object):
    pass
class Address(object):
    pass

mapper(User, users_table, properties={
    'addresses' : relation(mapper(Address, addresses_table), lazy=False)
})
r = session.query(User).select(User.c.user_name.like('F%'), limit=20, offset=10)
SELECT users.user_id AS users_user_id, users.user_name AS users_user_name,
users.password AS users_password, addresses.address_id AS addresses_address_id,
addresses.user_id AS addresses_user_id, addresses.street AS addresses_street,
addresses.city AS addresses_city, addresses.state AS addresses_state,
addresses.zip AS addresses_zip
FROM
(SELECT users.user_id FROM users WHERE users.user_name LIKE %(users_user_name)s
ORDER BY users.oid LIMIT 20 OFFSET 10) AS rowcount,
users LEFT OUTER JOIN addresses ON users.user_id = addresses.user_id
WHERE rowcount.user_id = users.user_id ORDER BY users.oid, addresses.oid
{'users_user_name': 'F%'}

The main WHERE clause as well as the limiting clauses are coerced into a subquery; this subquery represents the desired result of objects. A containing query, which handles the eager relationships, is joined against the subquery to produce the result.

back to section top
More on Mapper Options

The options method on the Query object produces a new Query object by creating a copy of the underlying Mapper and placing modified properties on it. The options method is also directly available off the Mapper object itself, so that the newly copied Mapper can be dealt with directly. The options method takes a variable number of MapperOption objects which know how to change specific things about the mapper. The five available options are eagerload, lazyload, noload, deferred and extension.

An example of a mapper with a lazy load relationship, upgraded to an eager load relationship:

class User(object):
    pass
class Address(object):
    pass

# a 'lazy' relationship
mapper(User, users_table, properties = {
    'addresses':relation(mapper(Address, addresses_table), lazy=True)
})

# copy the mapper and convert 'addresses' to be eager
eagermapper = class_mapper(User).options(eagerload('addresses'))

The defer and undefer options can control the deferred loading of attributes:

# set the 'excerpt' deferred attribute to load normally
m = book_mapper.options(undefer('excerpt'))

# set the referenced mapper 'photos' to defer its loading of the column 'imagedata'
m = book_mapper.options(defer('photos.imagedata'))
back to section top
Mapping a Class with Table Inheritance

Feature Status: Alpha Implementation

Inheritance in databases comes in three forms: single table inheritance, where several types of classes are stored in one table, concrete table inheritance, where each type of class is stored in its own table, and multiple table inheritance, where the parent/child classes are stored in their own tables that are joined together in a select.

There is also a concept of polymorphic loading, which indicates whether multiple kinds of classes can be loaded in one pass.

SQLAlchemy supports all three kinds of inheritance. Additionally, true polymorphic loading is supported in a straightforward way for single table inheritance, and has some more manually-configured features that can make it happen for concrete and multiple table inheritance.

Working examples of polymorphic inheritance come with the distribution in the directory examples/polymorphic.

Here are the classes we will use to represent an inheritance relationship:

class Employee(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name

class Manager(Employee):
    def __init__(self, name, manager_data):
        self.name = name
        self.manager_data = manager_data
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name + " " +  self.manager_data

class Engineer(Employee):
    def __init__(self, name, engineer_info):
        self.name = name
        self.engineer_info = engineer_info
    def __repr__(self):
        return self.__class__.__name__ + " " + self.name + " " +  self.engineer_info

Each class supports a common name attribute, while the Manager class has its own attribute manager_data and the Engineer class has its own attribute engineer_info.

Single Table Inheritance

This will support polymorphic loading via the Employee mapper.

employees_table = Table('employees', metadata, 
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('manager_data', String(50)),
    Column('engineer_info', String(50)),
    Column('type', String(20))
)

employee_mapper = mapper(Employee, employees_table, polymorphic_on=employees_table.c.type)
manager_mapper = mapper(Manager, inherits=employee_mapper, polymorphic_identity='manager')
engineer_mapper = mapper(Engineer, inherits=employee_mapper, polymorphic_identity='engineer')
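
A brief usage sketch (assuming a session as in earlier sections): instances saved as Manager or Engineer receive the matching value in the type column, and selecting via the base Employee mapper returns each row as its proper class:

session.save(Manager('Sally', 'knows managing'))
session.save(Engineer('Ed', 'knows engineering'))
session.flush()

# each row comes back as the class matching its 'type' value
for emp in session.query(Employee).select():
    print emp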
back to section top
Concrete Table Inheritance

Without polymorphic loading, you just define a separate mapper for each class.

Concrete Inheritance, Non-polymorphic
managers_table = Table('managers', metadata, 
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('manager_data', String(50)),
)

engineers_table = Table('engineers', metadata, 
    Column('employee_id', Integer, primary_key=True),
    Column('name', String(50)),
    Column('engineer_info', String(50)),
)

manager_mapper = mapper(Manager, managers_table)
engineer_mapper = mapper(Engineer, engineers_table)

With polymorphic loading, the SQL query that performs the actual polymorphic load must be constructed, usually as a UNION. There is a helper function to create these UNIONs called polymorphic_union.

Concrete Inheritance, Polymorphic
pjoin = polymorphic_union({
    'manager':managers_table,
    'engineer':engineers_table
}, 'type', 'pjoin')

employee_mapper = mapper(Employee, pjoin, polymorphic_on=pjoin.c.type)
manager_mapper = mapper(Manager, managers_table, inherits=employee_mapper, concrete=True, polymorphic_identity='manager')
engineer_mapper = mapper(Engineer, engineers_table, inherits=employee_mapper, concrete=True, polymorphic_identity='engineer')
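
Roughly speaking, the pjoin construct above corresponds to a UNION of this general form (a sketch only; polymorphic_union generates the actual statement, and the NULL padding and column ordering may differ by database):

SELECT employee_id, name, manager_data, NULL AS engineer_info, 'manager' AS type FROM managers
UNION ALL
SELECT employee_id, name, NULL AS manager_data, engineer_info, 'engineer' AS type FROM engineers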

A future release of SQLAlchemy might better merge the generated UNION into the mapper construction phase.

back to section top
Multiple Table Inheritance

Like concrete table inheritance, this can be done non-polymorphically, or with a little more complexity, polymorphically:

Multiple Table Inheritance, Non-polymorphic
people = Table('people', metadata, 
   Column('person_id', Integer, primary_key=True),
   Column('name', String(50)),
   Column('type', String(30)))

engineers = Table('engineers', metadata, 
   Column('person_id', Integer, ForeignKey('people.person_id'), primary_key=True),
   Column('engineer_info', String(50)),
  )

managers = Table('managers', metadata, 
   Column('person_id', Integer, ForeignKey('people.person_id'), primary_key=True),
   Column('manager_data', String(50)),
   )

person_mapper = mapper(Person, people)
mapper(Engineer, engineers, inherits=person_mapper)
mapper(Manager, managers, inherits=person_mapper)

Polymorphic:

Multiple Table Inheritance, Polymorphic
person_join = polymorphic_union(
    {
        'engineer':people.join(engineers),
        'manager':people.join(managers),
        'person':people.select(people.c.type=='person'),
    }, None, 'pjoin')

person_mapper = mapper(Person, people, select_table=person_join, polymorphic_on=person_join.c.type, polymorphic_identity='person')
mapper(Engineer, engineers, inherits=person_mapper, polymorphic_identity='engineer')
mapper(Manager, managers, inherits=person_mapper, polymorphic_identity='manager')

The join condition in a multiple table inheritance relationship can be specified explicitly, using inherit_condition:

AddressUser.mapper = mapper(
        AddressUser,
        addresses_table, inherits=User.mapper, 
        inherit_condition=users_table.c.user_id==addresses_table.c.user_id
    )
back to section top
Mapping a Class against Multiple Tables

Mappers can be constructed against arbitrary relational units (called Selectables) as well as plain Tables. For example, the join function from the SQL package creates a neat selectable unit comprised of multiple tables, complete with its own composite primary key, which can be passed to a mapper as the table.

# a class
class AddressUser(object):
    pass

# define a Join
j = join(users_table, addresses_table)

# map to it - the identity of an AddressUser object will be 
# based on (user_id, address_id) since those are the primary keys involved
m = mapper(AddressUser, j, properties={
    'user_id':[users_table.c.user_id, addresses_table.c.user_id]
})

A second example:

# many-to-many join on an association table
j = join(users_table, userkeywords, 
        users_table.c.user_id==userkeywords.c.user_id).join(keywords, 
           userkeywords.c.keyword_id==keywords.c.keyword_id)

# a class 
class KeywordUser(object):
    pass

# map to it - the identity of a KeywordUser object will be
# (user_id, keyword_id) since those are the primary keys involved
m = mapper(KeywordUser, j, properties={
    'user_id':[users_table.c.user_id, userkeywords.c.user_id],
    'keyword_id':[userkeywords.c.keyword_id, keywords.c.keyword_id]
})

In both examples above, "composite" columns were added as properties to the mappers; these are aggregations of multiple columns into one mapper property, which instructs the mapper to keep both of those columns set at the same value.

back to section top
Mapping a Class against Arbitrary Selects

Similar to mapping against a join, a plain select() object can be used with a mapper as well. Below, an example select which contains two aggregate functions and a group_by is mapped to a class:

s = select([customers, 
            func.count(orders.c.order_id).label('order_count'), 
            func.max(orders.c.price).label('highest_order')],
            customers.c.customer_id==orders.c.customer_id,
            group_by=[c for c in customers.c]
            )
class Customer(object):
    pass

m = mapper(Customer, s)

Above, the "customers" table is joined against the "orders" table to produce a full row for each customer row, the total count of related rows in the "orders" table, and the highest price in the "orders" table, grouped against the full set of columns in the "customers" table. That query is then mapped against the Customer class. New instances of Customer will contain attributes for each column in the "customers" table as well as an "ordercount" and "highestorder" attribute. Updates to the Customer object will only be reflected in the "customers" table and not the "orders" table. This is because the primary keys of the "orders" table are not represented in this mapper and therefore the table is not affected by save or delete operations.

back to section top
Multiple Mappers for One Class

The first mapper created for a certain class is known as that class's "primary mapper." Other mappers can be created as well; these come in two varieties.

  • secondary mapper - this is a mapper that must be constructed with the keyword argument non_primary=True, and represents a load-only mapper. Objects that are loaded with a secondary mapper will have their save operation processed by the primary mapper. It is also invalid to add new relation()s to a non-primary mapper. To use this mapper with the Session, specify it to the query method:

example:

# primary mapper
mapper(User, users_table)

# make a secondary mapper to load User against a join
othermapper = mapper(User, users_table.join(someothertable), non_primary=True)

# select
result = session.query(othermapper).select()
  • entity name mapper - this is a mapper that is a fully functioning primary mapper for a class, which is distinguished from the regular primary mapper by an entity_name parameter. Instances loaded with this mapper will be totally managed by this new mapper and have no connection to the original one. Most methods on Session include an optional entity_name parameter in order to specify this condition.

example:

# primary mapper
mapper(User, users_table)

# make an entity name mapper that stores User objects in another table
mapper(User, alternate_users_table, entity_name='alt')

# make two User objects
user1 = User()
user2 = User()

# save one in the "users" table
session.save(user1)

# save the other in the "alternate_users_table"
session.save(user2, entity_name='alt')

session.flush()

# select from the alternate mapper
session.query(User, entity_name='alt').select()
back to section top
Circular Mapping

Oftentimes it is necessary for two mappers to be related to each other. With a datamodel that consists of Users that store Addresses, you might have an Address object and want to access the "user" attribute on it, or have a User object and want to get the list of Address objects. The easiest way to do this is via the backref keyword described in Backreferences. However, even when backreferences are used, it is sometimes necessary to explicitly specify the relations on both mappers pointing to each other. Achieving this involves creating the first mapper by itself, then creating the second mapper with a reference to the first, then adding a property to the first mapper that references the second:

usermapper = mapper(User, users)
mapper(Address, addresses_table, properties={
    'user':relation(User)
})

usermapper.add_property('addresses', relation(Address))

Note that with a circular relationship as above, you cannot declare both relationships as "eager" relationships, since that produces a circular query situation which will generate a recursion exception. So what if you want to load an Address and its User eagerly? Just use eager options:

eagerquery = session.query(Address).options(eagerload('user'))
s = eagerquery.select(Address.c.address_id==12)
back to section top
Self Referential Mappers

A self-referential mapper is a mapper that is designed to operate with an adjacency list table. This is a table that contains one or more foreign keys back to itself, and is usually used to create hierarchical tree structures. SQLAlchemy's default model of saving items based on table dependencies is not sufficient in this case, as an adjacency list table introduces dependencies between individual rows. Fortunately, SQLAlchemy will automatically detect a self-referential mapper and do the extra lifting to make it work.

# define a self-referential table
trees = Table('treenodes', metadata,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
    )

# treenode class
class TreeNode(object):
    pass

# mapper defines "children" property, pointing back to TreeNode class,
# with the mapper unspecified.  it will point back to the primary 
# mapper on the TreeNode class.
TreeNode.mapper = mapper(TreeNode, trees, properties={
        'children' : relation(
                        TreeNode, 
                        cascade="all, delete-orphan"
                     ),
        }
    )

# or, specify the circular relationship after establishing the original mapper:
mymapper = mapper(TreeNode, trees)

mymapper.add_property('children', relation(
                        mymapper, 
                        cascade="all, delete-orphan"
                     ))

This kind of mapper goes through a lot of extra effort when saving and deleting items, to determine the correct dependency graph of nodes within the tree.
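
A minimal usage sketch of the mapper above (assuming a session as in earlier sections):

root = TreeNode()
root.node_name = 'root'
child = TreeNode()
child.node_name = 'child'

# appending to 'children' establishes the parent/child foreign key
root.children.append(child)

session.save(root)    # the "all, delete-orphan" cascade saves 'child' as well
session.flush()       # parent rows are INSERTed before their child rows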

A self-referential mapper where there is more than one relationship on the table requires that all join conditions be explicitly spelled out. Below is a self-referring table that contains a "parent_node_id" column to reference parent/child relationships, and a "root_node_id" column which points child nodes back to the ultimate root node:

# define a self-referential table with several relations
trees = Table('treenodes', metadata,
    Column('node_id', Integer, primary_key=True),
    Column('parent_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('root_node_id', Integer, ForeignKey('treenodes.node_id'), nullable=True),
    Column('node_name', String(50), nullable=False),
    )

# treenode class
class TreeNode(object):
    pass

# define the "children" property as well as the "root" property
TreeNode.mapper = mapper(TreeNode, trees, properties={
        'children' : relation(
                        TreeNode, 
                        primaryjoin=trees.c.parent_node_id==trees.c.node_id,
                        cascade="all, delete-orphan"
                     ),
        'root' : relation(
                TreeNode,
                primaryjoin=trees.c.root_node_id==trees.c.node_id, 
                foreignkey=trees.c.node_id,
                uselist=False
            )
        }
    )

The "root" property on a TreeNode is a many-to-one relationship. By default, a self-referential mapper declares relationships as one-to-many, so the extra parameter foreignkey, pointing to the remote side of a relationship, is needed to indicate a "many-to-one" self-referring relationship. Both TreeNode examples above are available in functional form in the examples/adjacencytree directory of the distribution.

back to section top
Result-Set Mapping

Take any result set and feed it into a mapper to produce objects. Multiple mappers can be combined to retrieve unrelated objects from the same row in one step. The instances method on mapper takes a ResultProxy object, which is the result type returned by statement execution, and delivers object instances.

class User(object):
    pass

User.mapper = mapper(User, users_table)

# select users
c = users_table.select().execute()

# get objects
userlist = User.mapper.instances(c)

# define a second class/mapper
class Address(object):
    pass

Address.mapper = mapper(Address, addresses_table)

# select users and addresses in one query
s = select([users_table, addresses_table], users_table.c.user_id==addresses_table.c.user_id)

# execute it, and process the results with the User mapper, chained to the Address mapper
r = User.mapper.instances(s.execute(), Address.mapper)

# result rows are an array of objects, one for each mapper used
for entry in r:
    user = entry[0]
    address = entry[1]
back to section top
Mapper Arguments

Other arguments not covered above include:

  • select_table=None - often used with polymorphic mappers, this is a Selectable which will take the place of the Mapper's main table argument when performing queries.
  • version_id_col=None - an integer-holding Column object that will be assigned an incrementing counter, which is added to the WHERE clause used by UPDATE and DELETE statements. The matching row count returned by the database is compared to the expected row count, and an exception is raised if they don't match. This is a basic "optimistic concurrency" check; a sketch follows this list. Without the version id column, SQLAlchemy still compares the updated rowcount.
  • always_refresh=False - this option will cause the mapper to refresh all the attributes of all objects loaded by select/get statements, regardless of whether they already exist in the current session. This includes all lazy- and eager-loaded relationship attributes, and will also overwrite any changes made to column-based attributes on the objects.
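
For illustration, here is a minimal sketch of version_id_col (the row_version column name is an assumption made for this example):

# users table with an added integer column to hold the version counter
users_table = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(50)),
    Column('row_version', Integer)
)

mapper(User, users_table, version_id_col=users_table.c.row_version)

# UPDATE and DELETE statements now include "WHERE row_version = :version",
# and UPDATEs increment the counter; a row changed elsewhere in the meantime
# produces a zero row count and raises an exception at flush() time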
back to section top
Extending Mapper

A Mapper can have functionality augmented or replaced at many points in its execution via the usage of the MapperExtension class. This class is just a series of "hooks" where various functionality takes place. An application can make its own MapperExtension objects, overriding only the methods it needs. Methods that are not overridden return the special value sqlalchemy.orm.mapper.EXT_PASS, which indicates the operation should proceed normally.

class MapperExtension(object):
    def select_by(self, query, *args, **kwargs):
        """overrides the select_by method of the Query object"""
    def select(self, query, *args, **kwargs):
        """overrides the select method of the Query object"""
    def create_instance(self, mapper, session, row, imap, class_):
        """called when a new object instance is about to be created from a row.  
        the method can choose to create the instance itself, or it can return 
        None to indicate normal object creation should take place.

        mapper - the mapper doing the operation

        row - the result row from the database

        imap - a dictionary that is storing the running set of objects collected from the
        current result set

        class_ - the class we are mapping.
        """
    def append_result(self, mapper, session, row, imap, result, instance, isnew, populate_existing=False):
        """called when an object instance is being appended to a result list.

        If this method returns True, it is assumed that the mapper should do the appending, else
        if this method returns False, it is assumed that the append was handled by this method.

        mapper - the mapper doing the operation

        row - the result row from the database

        imap - a dictionary that is storing the running set of objects collected from the
        current result set

        result - an instance of util.HistoryArraySet(), which may be an attribute on an
        object if this is a related object load (lazy or eager).  use result.append_nohistory(value)
        to append objects to this list.

        instance - the object instance to be appended to the result

        isnew - indicates if this is the first time we have seen this object instance in the current result
        set.  if you are selecting from a join, such as an eager load, you might see the same object instance
        many times in the same result set.

        populate_existing - usually False, indicates if object instances that were already in the main 
        identity map, i.e. were loaded by a previous select(), get their attributes overwritten
        """
    def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew):
        """called right before the mapper, after creating an instance from a row, passes the row
        to its MapperProperty objects which are responsible for populating the object's attributes.
        If this method returns True, it is assumed that the mapper should populate the object normally, else
        if this method returns False, it is assumed that the population was handled by this method.

        Essentially, this method is used to have a different mapper populate the object:

            def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew):
                othermapper.populate_instance(session, instance, row, identitykey, imap, isnew, frommapper=mapper)
                return True
        """
    def before_insert(self, mapper, connection, instance):
        """called before an object instance is INSERTed into its table.

        this is a good place to set up primary key values and such that arent handled otherwise."""
    def before_update(self, mapper, connection, instance):
        """called before an object instance is UPDATEd"""
    def after_update(self, mapper, connection, instance):
        """called after an object instance is UPDATEd"""
    def after_insert(self, mapper, connection, instance):
        """called after an object instance has been INSERTed"""
    def before_delete(self, mapper, connection, instance):
        """called before an object instance is DELETEd"""
    def after_delete(self, mapper, connection, instance):
        """called after an object instance is DELETEd"""

To use MapperExtension, make your own subclass of it and just send it off to a mapper:

m = mapper(User, users_table, extension=MyExtension())
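
For instance, here is a minimal sketch of an extension that stamps a creation time before each INSERT (the created attribute, and the DateTime column behind it, are assumptions for this example):

import datetime

class TimestampExtension(MapperExtension):
    def before_insert(self, mapper, connection, instance):
        # assumed: the mapped class has a 'created' attribute
        # backed by a DateTime column
        instance.created = datetime.datetime.now()

m = mapper(User, users_table, extension=TimestampExtension())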

Multiple extensions will be chained together and processed in order; they are specified as a list:

m = mapper(User, users_table, extension=[ext1, ext2, ext3])
back to section top


The package sqlalchemy.types defines the datatype identifiers which may be used when defining metadata. This package includes a set of generic types, a set of SQL-specific subclasses of those types, and a small extension system used by specific database connectors to adapt these generic types into database-specific type objects.

Built-in Types

SQLAlchemy comes with a set of standard generic datatypes, which are defined as classes.

The standard set of generic types are:

package sqlalchemy.types
class String(TypeEngine):
    def __init__(self, length=None)

class Integer(TypeEngine)

class SmallInteger(Integer)

class Numeric(TypeEngine): 
    def __init__(self, precision=10, length=2)

class Float(Numeric):
    def __init__(self, precision=10)

# DateTime, Date and Time types deal with datetime objects from the Python datetime module
class DateTime(TypeEngine)

class Date(TypeEngine)

class Time(TypeEngine)

class Binary(TypeEngine): 
    def __init__(self, length=None)

class Boolean(TypeEngine)

# converts unicode strings to raw bytes
# as bind params, raw bytes to unicode as 
# rowset values, using the unicode encoding 
# setting on the engine (defaults to 'utf-8')
class Unicode(TypeDecorator):
    impl = String

# uses the pickle protocol to serialize data
# in/out of Binary columns
class PickleType(TypeDecorator):
    impl = Binary

More specific subclasses of these types are available, which various database engines may choose to implement specifically, allowing finer grained control over types:

class FLOAT(Numeric)
class TEXT(String)
class DECIMAL(Numeric)
class INT(Integer)
INTEGER = INT
class TIMESTAMP(DateTime)
class DATETIME(DateTime)
class CLOB(String)
class VARCHAR(String)
class CHAR(String)
class BLOB(Binary)
class BOOLEAN(Boolean)

When using a specific database engine, these types are adapted even further via a set of database-specific subclasses defined by the database engine. There may eventually be more type objects that are defined for specific databases. An example of this would be Postgres' Array type.

Type objects are specified to table metadata using either the class itself, or an instance of the class. Creating an instance of the class allows you to specify parameters for the type, such as string length, numerical precision, etc.:

mytable = Table('mytable', metadata, 
    # define type using a class
    Column('my_id', Integer, primary_key=True), 

    # define type using an object instance
    Column('value', Numeric(7,4)) 
)
back to section top
Creating your Own Types

User-defined types can be created to support either database-specific types or customized pre-processing of query parameters as well as post-processing of result set data. You can make your own classes to perform these operations. To augment the behavior of a TypeEngine type, such as String, the TypeDecorator class is used:

import sqlalchemy.types as types

class MyType(types.TypeDecorator):
    """basic type that decorates String, prefixes values with "PREFIX:" on 
    the way in and strips it off on the way out."""
    impl = types.String
    def convert_bind_param(self, value, engine):
        return "PREFIX:" + value
    def convert_result_value(self, value, engine):
        return value[7:]
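
The new type is used in table metadata just like the built-in types (a sketch; the table is hypothetical):

mytable = Table('mytable', metadata,
    Column('my_id', Integer, primary_key=True),

    # values in this column are stored with "PREFIX:" prepended,
    # and returned with it stripped off
    Column('data', MyType(50))
)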

The Unicode and PickleType classes are themselves subclasses of TypeDecorator and can be subclassed directly.

To build a type object from scratch, which will not have a corresponding database-specific implementation, subclass TypeEngine:

import sqlalchemy.types as types

class MyType(types.TypeEngine):
    def __init__(self, precision = 8):
        self.precision = precision
    def get_col_spec(self):
        return "MYTYPE(%s)" % self.precision
    def convert_bind_param(self, value, engine):
        return value
    def convert_result_value(self, value, engine):
        return value
back to section top

This section describes the connection pool module of SQLAlchemy. The Pool object it provides is normally embedded within an Engine instance. For most cases, explicit access to the pool module is not required. However, the Pool object can be used on its own, without the rest of SA, to manage DBAPI connections; this section describes that usage. Also, this section will describe in more detail how to customize the pooling strategy used by an Engine.

At the base of any database helper library is a system of efficiently acquiring connections to the database. Since the establishment of a database connection is typically a somewhat expensive operation, an application needs a way to get at database connections repeatedly without incurring the full overhead each time. Particularly for server-side web applications, a connection pool is the standard way to maintain a "pool" of database connections which are used over and over again among many requests. Connection pools typically are configured to maintain a certain "size", which represents how many connections can be used simultaneously without resorting to creating more newly-established connections.

Establishing a Transparent Connection Pool

Any DBAPI module can be "proxied" through the connection pool using the following technique (note that the usage of 'psycopg2' is just an example; substitute whatever DBAPI module you'd like):

import sqlalchemy.pool as pool
import psycopg2 as psycopg
psycopg = pool.manage(psycopg)

# then connect normally
connection = psycopg.connect(database='test', user='scott', password='tiger')

This produces a sqlalchemy.pool.DBProxy object which supports the same connect() function as the original DBAPI module. Upon connection, a thread-local connection proxy object is returned, which delegates its calls to a real DBAPI connection object. This connection object is stored persistently within a connection pool (an instance of sqlalchemy.pool.Pool) that corresponds to the exact connection arguments sent to the connect() function. The connection proxy also returns a proxied cursor object upon calling connection.cursor(). When all cursors as well as the connection proxy are de-referenced, the connection is automatically made available again by the owning pool object.

Basically, the connect() function is used in its usual way, and the pool module transparently returns thread-local pooled connections. Each distinct set of connect arguments corresponds to its own newly created connection pool; in this way, an application can maintain connections to multiple schemas and/or databases, and each unique connect argument set will be managed by a different pool object.

back to section top
Connection Pool Configuration

When proxying a DBAPI module through the pool module, the following options control how the connections are pooled (a combined example follows this list):

  • echo=False : if set to True, connections being checked out from and returned to the pool will be logged to the standard output, as well as pool sizing information.
  • use_threadlocal=True : if set to True, repeated calls to connect() within the same application thread will be guaranteed to return the same connection object, if one has already been retrieved from the pool and has not been returned yet. This allows code to retrieve a connection from the pool, and then while still holding on to that connection, to call other functions which also ask the pool for a connection of the same arguments; those functions will act upon the same connection that the calling method is using. Note that once the connection is returned to the pool, it then may be used by another thread. To guarantee a single unique connection per thread that never changes, use the option poolclass=SingletonThreadPool, in which case the use_threadlocal parameter is automatically set to False.
  • poolclass=QueuePool : the Pool class used by the pool module to provide pooling. QueuePool uses the Python Queue.Queue class to maintain a list of available connections. A developer can supply his or her own Pool class to supply a different pooling algorithm. Also included is the SingletonThreadPool, which provides a single distinct connection per thread and is required with SQLite.
  • pool_size=5 : used by QueuePool - the size of the pool to be maintained. This is the largest number of connections that will be kept persistently in the pool. Note that the pool begins with no connections; once this number of connections is requested, that number of connections will remain.
  • max_overflow=10 : used by QueuePool - the maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of "sleeping" connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections.
  • timeout=30 : used by QueuePool - the timeout before giving up on returning a connection, if none are available and the max_overflow has been reached.
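
Combining several of these options with the proxying technique from the previous section (a sketch; the connect arguments are placeholders):

import sqlalchemy.pool as pool
import psycopg2 as psycopg

# proxy the DBAPI module, specifying pooling options explicitly
psycopg = pool.manage(psycopg, pool_size=10, max_overflow=20,
                      timeout=60, use_threadlocal=True, echo=False)

connection = psycopg.connect(database='test', user='scott', password='tiger')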
back to section top
Custom Pool Construction

One level below using a DBProxy to make transparent pools is creating the pool yourself. The pool module comes with two implementations of connection pools: QueuePool and SingletonThreadPool. While QueuePool uses Queue.Queue to provide connections, SingletonThreadPool provides a single per-thread connection which SQLite requires.

Constructing your own pool involves passing a callable that is used to create a connection. Through this method, custom connection schemes can be made, such as a connection that automatically executes initialization commands when first created. The options from the previous section can be used as they apply to QueuePool or SingletonThreadPool.

Plain QueuePool
import sqlalchemy.pool as pool
import psycopg2

def getconn():
    c = psycopg2.connect(user='ed', host='127.0.0.1', dbname='test')
    # execute an initialization statement on the connection before returning it
    cursor = c.cursor()
    cursor.execute("select setup_encodings()")
    return c

p = pool.QueuePool(getconn, max_overflow=10, pool_size=5, use_threadlocal=True)

Or with SingletonThreadPool:

SingletonThreadPool
import sqlalchemy.pool as pool
import sqlite

def getconn():
    return sqlite.connect(filename='myfile.db')

# SQLite connections require the SingletonThreadPool    
p = pool.SingletonThreadPool(getconn)
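
In either case, connections are checked out from the constructed pool via its connect() method (a sketch):

# check out a connection from the pool
conn = p.connect()
cursor = conn.cursor()
cursor.execute("select 1")
cursor.close()

# closing (or fully de-referencing) the connection returns it to the pool
conn.close()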
back to section top

SQLAlchemy has a variety of extensions and "mods" available which provide extra functionality to SA, either via explicit usage or by augmenting the core behavior.

threadlocal

Author: Mike Bayer and Daniel Miller

Establishes threadlocal as the default strategy for new ComposedSQLEngine objects, installs a threadlocal SessionContext that is attached to all Mappers via a global MapperExtension, and establishes the global SessionContext under the name sqlalchemy.objectstore. Usually this is used in combination with Tables that are associated with BoundMetaData or DynamicMetaData, so that the Session does not need to be bound to any Engine explicitly.

import sqlalchemy.mods.threadlocal
from sqlalchemy import *

metadata = BoundMetaData('sqlite:///')
user_table = Table('users', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(50), nullable=False)
)

class User(object):
    pass
mapper(User, user_table)

# thread local session
session = objectstore.get_session()

# "user" object is added to the session automatically
user = User()

session.flush()
get_session() Implemented on All Mappers

All Mapper objects constructed after the threadlocal import will receive a default MapperExtension which implements the get_session() method, returning the Session that is associated with the current thread by the global SessionContext. All newly constructed objects will automatically be attached to the Session corresponding to the current thread, i.e. they will skip the "transient" state and go right to "pending".

This occurs because when a Mapper is first constructed for a class, it decorates the class's __init__() method in a manner like the following:

oldinit = class_.__init__   # the previous init method
def __init__(self):
    session = ext.get_session() # get Session from this Mapper's MapperExtension
    if session is EXT_PASS:
        session = None
    if session is not None:
        session.save(self)  # attach to the current session
    oldinit(self)   # call previous init method

An instance can be redirected at construction time to a different Session by specifying the keyword parameter _sa_session:

session = create_session()  # create a new session distinct from the thread-local session
myuser = User(_sa_session=session)  # make a new User that is saved to this session

Similarly, the entity_name parameter, which specifies an alternate Mapper to be used when attaching this instance to the Session, can be specified via _sa_entity_name:

myuser = User(_sa_session=session, _sa_entity_name='altentity')
back to section top
Default Query Objects

The MapperExtension object's get_session() method is also used by the Query object to locate a Session with which to store newly loaded instances, if the Query is not already associated with a specific Session. As a result, the Query can be constructed standalone from a mapper or class:

# create a Query from a class
query = Query(User)

# specify entity name
query = Query(User, entity_name='foo')

# create a Query from a mapper
query = Query(mapper)
back to section top
objectstore Namespace

The objectstore is an instance of SessionContext, available in the sqlalchemy namespace which provides a proxy to the underlying Session bound to the current thread. objectstore can be treated just like the Session itself:

objectstore.save(instance)
objectstore.flush()

objectstore.clear()
back to section top
Attaching Mappers to their Class

With get_session() handling the details of providing a Session in all cases, the assign_mapper function provides some of the functionality of Query and Session directly off the mapped instances themselves. This is a "monkeypatch" function that creates a primary mapper, attaches the mapper to the class, and also attaches the methods get, select, select_by, selectone, get_by, join_to, join_via, flush, delete, expire, refresh, expunge, merge, update, save, and save_or_update:

# "assign" a mapper to the User class/users table
assign_mapper(User, users)

# methods are attached to the class for selecting
userlist = User.select_by(user_id=12)

myuser = User.get(1)

# mark an object as deleted for the next commit
myuser.delete()

# flush the changes on a specific object
myotheruser.flush()

A more generic version of assign_mapper that works with any SessionContext is available in the assignmapper plugin.

back to section top
Engine Strategy Set to threadlocal By Default

The threadlocal mod also establishes threadlocal as the default strategy when calling the create_engine() function. This strategy is specified by the strategy keyword argument to create_engine() and can still be overridden to be "plain" or "threadlocal" explicitly.

An Engine created with the threadlocal strategy will use a thread-locally managed connection object for all implicit statement executions and schema operations. Recall from Database Engines that an implicit execution is an execution where the Connection object is opened and closed internally, and the connect() method on Engine is not used; such as:

result = table.select().execute()

Above, the result variable holds onto a ResultProxy which is still referencing a connection returned by the connection pool. threadlocal strategy means that a second execute() statement in the same thread will use the same connection as the one referenced by result, assuming result is still referenced in memory.

The Mapper, Session, and Query implementations work equally well with either the default or threadlocal engine strategies. However, using the threadlocal strategy means that Session operations will use the same underlying connection as that of straight execute() calls with constructed SQL objects:

# assume "threadlocal" strategy is enabled, and there is no transaction in progress

result = table.select().execute()   # 'result' references a DBAPI connection, bound to the current thread

obj = session.query(User).select()  # the 'select' operation also uses the current thread's connection,
                                    # i.e. the same connection referenced by 'result'

result.close()                      # return the connection to the pool.  now there is no connection 
                                    # associated with the current thread.  the next execution will re-check out a 
                                    # connection and re-attach to the current thread.
back to section top
SessionContext

Author: Daniel Miller

This plugin is a generalized version of the objectstore object provided by the threadlocal plugin:

import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

ctx = SessionContext(sqlalchemy.create_session)

class User(object):
    pass

mapper(User, users_table, extension=ctx.mapperextension)

# 'u' is automatically added to the current session of 'ctx'
u = User()

# get the current session and flush
ctx.current.flush()

The construction of each Session instance can be customized by providing a "creation function" which returns a new Session. The "scope" to which the session is associated, which by default is the current thread, can be customized by providing a "scope callable" which returns a hashable key that represents the current scope:

import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext

# create an engine
someengine = sqlalchemy.create_engine('sqlite:///')

# a function to return a Session bound to our engine
def make_session():
    return sqlalchemy.create_session(bind_to=someengine)

# global declaration of "scope"
scope = "scope1"

# a function to return the current "session scope"
def global_scope_func():
    return scope

# create SessionContext with our two functions
ctx = SessionContext(make_session, scopefunc=global_scope_func)

# get the session corresponding to "scope1", bound to engine "someengine":
session = ctx.current

# switch the "scope"
scope = "scope2"

# get the session corresponding to "scope2", bound to engine "someengine":
session = ctx.current
back to section top
assignmapper

Author: Mike Bayer

This is a generic version of the assign_mapper function present in the threadlocal mod. It requires an explicit SessionContext.

import sqlalchemy
from sqlalchemy.ext.sessioncontext import SessionContext
from sqlalchemy.ext.assignmapper import assign_mapper

# session context
ctx = SessionContext(sqlalchemy.create_session)

# assign mapper to class MyClass using table 'sometable', getting
# Sessions from 'ctx'.
assign_mapper(ctx, MyClass, sometable)
back to section top
ActiveMapper

Author: Jonathan LaCour

ActiveMapper is a so-called "declarative layer" which allows the construction of a class, a Table, and a Mapper all in one step:

class Person(ActiveMapper):
    class mapping:
        id          = column(Integer, primary_key=True)
        full_name   = column(String)
        first_name  = column(String)
        middle_name = column(String)
        last_name   = column(String)
        birth_date  = column(DateTime)
        ssn         = column(String)
        gender      = column(String)
        home_phone  = column(String)
        cell_phone  = column(String)
        work_phone  = column(String)
        prefs_id    = column(Integer, foreign_key=ForeignKey('preferences.id'))
        addresses   = one_to_many('Address', colname='person_id', backref='person')
        preferences = one_to_one('Preferences', colname='pref_id', backref='person')

    def __str__(self):
        s =  '%s\n' % self.full_name
        s += '  * birthdate: %s\n' % (self.birth_date or 'not provided')
        s += '  * fave color: %s\n' % (self.preferences.favorite_color or 'Unknown')
        s += '  * personality: %s\n' % (self.preferences.personality_type or 'Unknown')

        for address in self.addresses:
            s += '  * address: %s\n' % address.address_1
            s += '             %s, %s %s\n' % (address.city, address.state, address.postal_code)

        return s

class Preferences(ActiveMapper):
    class mapping:
        __table__        = 'preferences'
        id               = column(Integer, primary_key=True)
        favorite_color   = column(String)
        personality_type = column(String)

class Address(ActiveMapper):
    class mapping:
        id          = column(Integer, primary_key=True)
        type        = column(String)
        address_1   = column(String)
        city        = column(String)
        state       = column(String)
        postal_code = column(String)
        person_id   = column(Integer, foreign_key=ForeignKey('person.id'))

More discussion on ActiveMapper can be found at Jonathan LaCour's Blog as well as the SQLAlchemy Wiki.

back to section top
SqlSoup

Author: Jonathan Ellis

SqlSoup creates mapped classes on the fly from tables, which are automatically reflected from the database based on name. It is essentially a nicer version of the "row data gateway" pattern.

>>> from sqlalchemy.ext.sqlsoup import SqlSoup
>>> soup = SqlSoup('sqlite:///')

>>> users = soup.users.select()
>>> users.sort()
>>> users
[Class_Users(name='Bhargan Basepair',email='basepair@example.edu',password='basepair',classname=None,admin=1),
 Class_Users(name='Joe Student',email='student@example.edu',password='student',classname=None,admin=0)]

Read more about SqlSoup on Jonathan Ellis' Blog.

back to section top
ProxyEngine

Author: Jason Pellerin

The ProxyEngine is used to "wrap" an Engine, and via subclassing ProxyEngine one can instrument the functionality of an arbitrary Engine instance through the decorator pattern. It also provides a connect() method which will send all Engine requests to different underlying engines. Its functionality in that regard is largely superseded now by DynamicMetaData, which is a better solution.

from sqlalchemy.ext.proxy import ProxyEngine
proxy = ProxyEngine()

proxy.connect('postgres://user:pw@host/db')
back to section top
SelectResults

Author: Jonas Borgström

SelectResults gives transformative behavior to the results returned from the select and select_by methods of Query. It supports three modes of operation: per-query, per-mapper, and per-application.

SelectResults with a Query Object
from sqlalchemy.ext.selectresults import SelectResults

query = session.query(MyClass)
res = SelectResults(query, table.c.column == "something")
res = res.order_by([table.c.column]) #add an order clause

for x in res[:10]:  # Fetch and print the top ten instances
  print x.column2

x = list(res) # execute the query

# Count how many instances have column2 > 42
# and column == "something"
print res.filter(table.c.column2 > 42).count()

Per mapper:

SelectResults with a Mapper Object
from sqlalchemy.ext.selectresults import SelectResultsExt
mapper(MyClass, mytable, extension=SelectResultsExt())
session.query(MyClass).select(mytable.c.column=="something").order_by([mytable.c.column])[2:7]

Or across an application via the selectresults mod:

SelectResults via mod
import sqlalchemy.mods.selectresults

mapper(MyClass, mytable)
session.query(MyClass).select(mytable.c.column=="something").order_by([mytable.c.column])[2:7]

For a full listing of methods, see the generated documentation.

back to section top

Modules and Classes
Module sqlalchemy.sql

defines the base components of SQL expression trees.

Module Functions
def alias(*args, **params)

def and_(*clauses)

joins a list of clauses together by the AND operator. the & operator can be used as well.

def asc(column)

returns an ascending ORDER BY clause element, e.g.:

order_by = [asc(table1.c.mycol)]

def between_(ctest, cleft, cright)

returns BETWEEN predicate clause (clausetest BETWEEN clauseleft AND clauseright)

def bindparam(key, value=None, type=None)

creates a bind parameter clause with the given key.

An optional default value can be specified by the value parameter, and the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this bind parameter.

def case(whens, value=None, else_=None)

SQL CASE statement -- whens are a sequence of pairs to be translated into "when / then" clauses; optional [value] for simple case statements, and [else_] for case defaults

def cast(clause, totype, **kwargs)

returns CAST function CAST(clause AS totype). Use with a sqlalchemy.types.TypeEngine object, e.g. cast(table.c.unit_price * table.c.qty, Numeric(10,4)) or cast(table.c.timestamp, DATE)

def column(text, table=None, type=None)

returns a textual column clause, relative to a table. this is a primitive version of the schema.Column object, which is a subclass of this object.

def delete(table, whereclause=None, **kwargs)

returns a DELETE clause element.

This can also be called from a table directly via the table's delete() method.

'table' is the table to be deleted from. 'whereclause' is a ClauseElement describing the WHERE condition of the DELETE statement.

def desc(column)

returns a descending ORDER BY clause element, e.g.:

order_by = [desc(table1.c.mycol)]

def exists(*args, **params)

def insert(table, values=None, **kwargs)

returns an INSERT clause element.

This can also be called from a table directly via the table's insert() method.

'table' is the table to be inserted into.

'values' is a dictionary which specifies the column specifications of the INSERT, and is optional. If left as None, the column specifications are determined from the bind parameters used during the compile phase of the INSERT statement. If the bind parameters also are None during the compile phase, then the column specifications will be generated from the full list of table columns.

If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.

The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this INSERT statement's table, the statement will be correlated against the INSERT statement.
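
As a sketch of these behaviors (using the users table from earlier chapters):

# an explicit 'values' dictionary determines the column list
stmt = insert(users_table, values={'user_name' : 'ed'})
stmt.execute()

# alternatively, bind parameters at execution time determine the columns
users_table.insert().execute(user_name='wendy', password='fred')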

def join(left, right, onclause=None, **kwargs)

returns a JOIN clause element (regular inner join), given the left and right hand expressions, as well as the ON condition's expression. To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.

def literal(value, type=None)

returns a literal clause, bound to a bind parameter.

literal clauses are created automatically when used as the right-hand side of a boolean or math operation against a column object. use this function when a literal is needed on the left-hand side (and optionally on the right as well).

the optional type parameter is a sqlalchemy.types.TypeEngine object which indicates bind-parameter and result-set translation for this literal.

def not_(clause)

returns a negation of the given clause, i.e. NOT(clause). the ~ operator can be used as well.

def null()

returns a Null object, which compiles to NULL in a sql statement.

def or_(*clauses)

joins a list of clauses together by the OR operator. the | operator can be used as well.

def outerjoin(left, right, onclause=None, **kwargs)

returns an OUTER JOIN clause element, given the left and right hand expressions, as well as the ON condition's expression. To chain joins together, use the resulting Join object's "join()" or "outerjoin()" methods.

def select(columns=None, whereclause=None, from_obj=[], **kwargs)

returns a SELECT clause element.

this can also be called via the table's select() method.

'columns' is a list of columns and/or selectable items to select columns from. 'whereclause' is a text or ClauseElement expression which will form the WHERE clause. 'from_obj' is a list of additional "FROM" objects, such as Join objects, which will extend or override the default "from" objects created from the column list and the whereclause. **kwargs - additional parameters for the Select object.

def subquery(alias, *args, **kwargs)

def table(name, *columns)

returns a table clause. this is a primitive version of the schema.Table object, which is a subclass of this object.

def text(text, engine=None, *args, **kwargs)

creates literal text to be inserted into a query.

When constructing a query from a select(), update(), insert() or delete(), using plain strings for argument values will usually result in text objects being created automatically. Use this function when creating textual clauses outside of other ClauseElement objects, or optionally wherever plain text is to be used.

Arguments include:

text - the text of the SQL statement to be created. use :<param> to specify bind parameters; they will be compiled to their engine-specific format.

engine - an optional engine to be used for this text query.

bindparams - a list of bindparam() instances which can be used to define the types and/or initial values for the bind parameters within the textual statement; the keynames of the bindparams must match those within the text of the statement. The types will be used for pre-processing on bind values.

typemap - a dictionary mapping the names of columns represented in the SELECT clause of the textual statement to type objects, which will be used to perform post-processing on columns within the result set (for textual statements that produce result sets).
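
Combining these arguments (a sketch, assuming an Engine named db as in the tutorial):

t = text("select user_id, user_name from users where user_name = :name",
         engine=db,
         bindparams=[bindparam('name', type=String)],
         typemap={'user_id' : Integer})

result = t.execute(name='ed')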

def union(*selects, **params)

def union_all(*selects, **params)

def update(table, whereclause=None, values=None, **kwargs)

returns an UPDATE clause element.

This can also be called from a table directly via the table's update() method.

'table' is the table to be updated. 'whereclause' is a ClauseElement describing the WHERE condition of the UPDATE statement. 'values' is a dictionary which specifies the SET conditions of the UPDATE, and is optional. If left as None, the SET conditions are determined from the bind parameters used during the compile phase of the UPDATE statement. If the bind parameters also are None during the compile phase, then the SET conditions will be generated from the full list of table columns.

If both 'values' and compile-time bind parameters are present, the compile-time bind parameters override the information specified within 'values' on a per-key basis.

The keys within 'values' can be either Column objects or their string identifiers. Each key may reference one of: a literal data value (i.e. string, number, etc.), a Column object, or a SELECT statement. If a SELECT statement is specified which references this UPDATE statement's table, the statement will be correlated against the UPDATE statement.
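
Two minimal sketches against a hypothetical users table:

# SET clause given explicitly via 'values'
update(users, users.c.user_id == 7, values={users.c.name: 'ed'}).execute()

# SET clause generated from the bind parameters present at execution time
u = update(users, users.c.user_id == bindparam('id'))
u.execute(id=7, name='ed')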

back to section top
Class Engine(object)

represents a 'thing that can produce Compiled objects and execute them'.

def compiler(self, statement, parameters, **kwargs)

def execute_compiled(self, compiled, parameters, echo=None, **kwargs)

back to section top
Class AbstractDialect(object)

represents the behavior of a particular database. Used by Compiled objects.

back to section top
Class ClauseParameters(OrderedDict)

represents a dictionary/iterator of bind parameter key names/values. Includes parameters compiled with a Compiled object as well as additional arguments passed to the Compiled object's get_params() method. Parameter values will be converted as per the TypeEngine objects present in the bind parameter objects. The non-converted value can be retrieved via the get_original method. For Compiled objects that compile positional parameters, the values() iteration of the object will return the parameter values in the correct order.

def __init__(self, dialect)

def get_original(self, key)

returns the given parameter as it was originally placed in this ClauseParameters object, without any Type conversion

def get_original_dict(self)

def get_raw_dict(self)

def set_parameter(self, key, value, bindparam)

def values(self)

back to section top
Class Compiled(ClauseVisitor)

represents a compiled SQL expression. the __str__ method of the Compiled object should produce the actual text of the statement. Compiled objects are specific to the database library that created them, and also may or may not be specific to the columns referenced within a particular set of bind parameters. In no case should the Compiled object be dependent on the actual values of those bind parameters, even though it may reference those values as defaults.

def __init__(self, dialect, statement, parameters, engine=None)

constructs a new Compiled object.

statement - ClauseElement to be compiled

parameters - optional dictionary indicating a set of bind parameters specified with this Compiled object. These parameters are the "default" values corresponding to the ClauseElement's BindParamClauses when the Compiled is executed. In the case of an INSERT or UPDATE statement, these parameters will also result in the creation of new BindParamClause objects for each key and will also affect the generated column list in an INSERT statement and the SET clauses of an UPDATE statement. The keys of the parameter dictionary can either be the string names of columns or ColumnClause objects.

engine - optional Engine to compile this statement against

def compile(self)

def execute(self, *multiparams, **params)

executes this compiled object using the AbstractEngine it is bound to.

def get_params(self, **params)

returns the bind params for this compiled object.

Will start with the default parameters specified when this Compiled object was first constructed, and will override those values with those sent via **params, which are key/value pairs. Each key should match one of the BindParamClause objects compiled into this object; either the "key" or "shortname" property of the BindParamClause.

def scalar(self, *multiparams, **params)

executes this compiled object via the execute() method, then returns the first column of the first row. Useful for executing functions, sequences, rowcounts, etc.

back to section top
Class ClauseElement(object)

base class for elements of a programmatically constructed SQL expression.

def accept_visitor(self, visitor)

accepts a ClauseVisitor and calls the appropriate visit_xxx method.

def compare(self, other)

compares this ClauseElement to the given ClauseElement.

Subclasses should override the default behavior, which is a straight identity comparison.

def compile(self, engine=None, parameters=None, compiler=None, dialect=None)

compiles this SQL expression.

Uses the given Compiler, or the given AbstractDialect or Engine, to create a Compiler. If no compiler arguments are given, tries to create a Compiler from the Engine this ClauseElement is bound to, if any. Finally, if there is no bound Engine, uses an ANSIDialect to create a default Compiler.

'parameters' is a dictionary representing the default bind parameters to be used with the statement. If the parameters are given as a list, it is assumed to be a list of dictionaries and the first dictionary in the list is used to compile against. The bind parameters can in some cases determine the output of the compilation, such as for UPDATE and INSERT statements, where the bind parameters that are present determine the SET and VALUES clauses of those statements.
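
For instance, a rough sketch against a hypothetical users table:

c = select([users], users.c.name == 'fred').compile()
print str(c)            # the generated SQL string
print c.get_params()    # the default bind parameters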

def copy_container(self)

should return a copy of this ClauseElement, iff this ClauseElement contains other ClauseElements. Otherwise, it should be left alone to return self. This is used to create copies of expression trees that still reference the same "leaf nodes". The new structure can then be restructured without affecting the original.

engine = property()

attempts to locate an Engine within this ClauseElement structure, or returns None if none found.

def execute(self, *multiparams, **params)

def execute_using(self, engine, *multiparams, **params)

def is_selectable(self)

returns True if this ClauseElement is Selectable, i.e. it contains a list of Column objects and can be used as the target of a select statement.

def scalar(self, *multiparams, **params)

def scalar_using(self, engine, *multiparams, **params)

def using(self, abstractengine)

back to section top
Class TableClause(FromClause)

def __init__(self, name, *columns)

def accept_visitor(self, visitor)

def alias(self, name=None)

def append_column(self, c)

c = property()

columns = property()

def count(self, whereclause=None, **params)

def delete(self, whereclause=None)

foreign_keys = property()

indexes = property()

def insert(self, values=None)

def join(self, right, *args, **kwargs)

def named_with_column(self)

original_columns = property()

def outerjoin(self, right, *args, **kwargs)

primary_key = property()

def select(self, whereclause=None, **params)

def update(self, whereclause=None, values=None)

back to section top
Class ColumnClause(ColumnElement)

represents a textual column clause in a SQL statement. May or may not be bound to an underlying Selectable.

def __init__(self, text, selectable=None, type=None, hidden=False)

def accept_visitor(self, visitor)

def to_selectable(self, selectable)

given a Selectable, returns this column's equivalent in that Selectable, if any.

for example, this could translate the column "name" from a Table object to an Alias of a Select off of that Table object.

back to section top
Module sqlalchemy.schema

the schema module provides the building blocks for database metadata. This means all the entities within a SQL database that we might want to look at, modify, or create and delete are described by these objects, in a database-agnostic way.

A structure of SchemaItems also provides a "visitor" interface which is the primary method by which other methods operate upon the schema. The SQL package extends this structure with its own clause-specific objects as well as the visitor interface, so that the schema package "plugs in" to the SQL package.

Class BoundMetaData(MetaData)

builds upon MetaData to provide the capability to bind to an Engine implementation.

def __init__(self, engine_or_url, name=None, **kwargs)

def is_bound(self)

back to section top
Class Column(SchemaItem,ColumnClause)

represents a column in a database table. this is a subclass of sql.ColumnClause and represents an actual existing column in the database, in a similar fashion as TableClause/Table.

def __init__(self, name, type, *args, **kwargs)

constructs a new Column object. Arguments are:

name : the name of this column. this should be the name exactly as it appears, or will appear, in the database.

type : this is the type of column. This can be any subclass of types.TypeEngine, including the database-agnostic types defined in the types module, database-specific types defined within specific database modules, or user-defined types.

*args : ForeignKey and Sequence objects should be added as list values.

**kwargs : keyword arguments include:

key=None : an optional "alias name" for this column. The column will then be identified everywhere in an application, including the column list on its Table, by this key, and not the given name. Generated SQL, however, will still reference the column by its actual name.

primary_key=False : True if this column is a primary key column. Multiple columns can have this flag set to specify composite primary keys.

nullable=True : True if this column should allow nulls. Defaults to True unless this column is a primary key column.

default=None : a scalar, python callable, or ClauseElement representing the "default value" for this column, which will be invoked upon insert if this column is not present in the insert list or is given a value of None.

hidden=False : indicates this column should not be listed in the table's list of columns. Used for the "oid" column, which generally isn't in column lists.

index=None : True or index name. Indicates that this column is indexed. Pass True to autogenerate the index name. Pass a string to specify the index name. Multiple columns that specify the same index name will all be included in the index, in the order of their creation.

unique=None : True or index name. Indicates that this column is indexed in a unique index. Pass True to autogenerate the index name. Pass a string to specify the index name. Multiple columns that specify the same index name will all be included in the index, in the order of their creation.
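
Putting several of these arguments together, a sketch of a hypothetical table definition (the table, column names and Sequence are assumptions):

users = Table('users', metadata,
    Column('user_id', Integer, Sequence('user_id_seq', optional=True), primary_key=True),
    Column('user_name', String(40), key='name', index=True),    # accessed as users.c.name
    Column('status', String(20), nullable=False, default='active')
)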

def accept_schema_visitor(self, visitor)

traverses the given visitor to this Column's default and foreign key object, then calls visit_column on the visitor.

def append_item(self, item)

columns = property()

def copy(self)

creates a copy of this Column, uninitialized

back to section top
Class ColumnDefault(DefaultGenerator)

A plain default value on a column. this could correspond to a constant, a callable function, or a SQL clause.

def __init__(self, arg, **kwargs)

def accept_schema_visitor(self, visitor)

calls the visit_column_default method on the given visitor.

back to section top
Class DynamicMetaData(MetaData)

builds upon MetaData to provide the capability to bind to multiple Engine implementations on a dynamically alterable, thread-local basis.

def __init__(self, name=None, threadlocal=True)

def connect(self, engine_or_url, **kwargs)

def dispose(self)

disposes all Engines to which this DynamicMetaData has been connected.

engine = property()

def is_bound(self)

back to section top
Class ForeignKey(SchemaItem)

defines a ForeignKey constraint between two columns. ForeignKey is specified as an argument to a Column object.

def __init__(self, column)

Constructs a new ForeignKey object. "column" can be a schema.Column object representing the relationship, or just its string name given as "tablename.columnname". A schema can be specified as "schema.tablename.columnname".
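
For example, a sketch referencing the hypothetical users table above:

addresses = Table('addresses', metadata,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey('users.user_id')),
    Column('email_address', String(100))
)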

def accept_schema_visitor(self, visitor)

calls the visit_foreign_key method on the given visitor.

column = property()

def copy(self)

produces a copy of this ForeignKey object.

def references(self, table)

returns True if the given table is referenced by this ForeignKey.

back to section top
Class Index(SchemaItem)

Represents an index of columns from a database table

def __init__(self, name, *columns, **kw)

Constructs an index object. Arguments are:

name : the name of the index

*columns : columns to include in the index. All columns must belong to the same table, and no column may appear more than once.

**kw : keyword arguments include:

unique=True : create a unique index
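
A brief sketch, again using the hypothetical users table:

i = Index('idx_user_name', users.c.name, unique=True)
i.create()    # issues CREATE UNIQUE INDEX against the bound engine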

def accept_schema_visitor(self, visitor)

def append_column(self, column)

def create(self, engine=None)

def drop(self, engine=None)

back to section top
Class MetaData(SchemaItem)

represents a collection of Tables and their associated schema constructs.

def __init__(self, name=None)

def clear(self)

def create_all(self, engine=None, tables=None)

def drop_all(self, engine=None, tables=None)

def is_bound(self)

def table_iterator(self, reverse=True)

back to section top
Class PassiveDefault(DefaultGenerator)

a default that takes effect on the database side

def __init__(self, arg, **kwargs)

def accept_schema_visitor(self, visitor)

back to section top
Class SchemaItem(object)

base class for items that define a database schema.

engine = property()

metadata = property()

back to section top
Class SchemaVisitor(ClauseVisitor)

defines the visiting for SchemaItem objects

def visit_column(self, column)

visit a Column.

def visit_column_default(self, default)

visit a ColumnDefault.

def visit_column_onupdate(self, onupdate)

visit a ColumnDefault with the "for_update" flag set.

def visit_foreign_key(self, join)

visit a ForeignKey.

def visit_index(self, index)

visit an Index.

def visit_passive_default(self, default)

visit a passive default

def visit_schema(self, schema)

visit a generic SchemaItem

def visit_sequence(self, sequence)

visit a Sequence.

def visit_table(self, table)

visit a Table.

back to section top
Class Sequence(DefaultGenerator)

represents a sequence, which applies to Oracle and Postgres databases.

def __init__(self, name, start=None, increment=None, optional=False, **kwargs)

def accept_schema_visitor(self, visitor)

calls the visit_sequence method on the given visitor.

def create(self)

def drop(self)

back to section top
Class Table(SchemaItem,TableClause)

represents a relational database table. This subclasses sql.TableClause to provide a table that is "wired" to an engine. Whereas TableClause represents a table as it's used in a SQL expression, Table represents a table as it's created in the database.

Be sure to look at sqlalchemy.sql.TableImpl for additional methods defined on a Table.

def __init__(self, name, metadata, *args, **kwargs)

Table objects can be constructed directly. The init method is actually called via the TableSingleton metaclass. Arguments are:

name : the name of this table, exactly as it appears, or will appear, in the database. This property, along with the "schema", indicates the "singleton identity" of this table. Further tables constructed with the same name/schema combination will return the same Table instance.

*args : should contain a listing of the Column objects for this table.

**kwargs : options include:

schema=None : the "schema name" for this table, which is required if the table resides in a schema other than the default selected schema for the engine's database connection.

autoload=False : the Columns for this table should be reflected from the database. Usually there will be no Column objects in the constructor if this property is set.

redefine=False : if this Table has already been defined in the application, clear out its columns and redefine with new arguments.

mustexist=False : indicates that this Table must already have been defined elsewhere in the application, else an exception is raised.

useexisting=False : indicates that if this Table was already defined elsewhere in the application, disregard the rest of the constructor arguments. If this flag and the "redefine" flag are not set, constructing the same table twice will result in an exception.
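
A sketch of reflection via autoload, assuming a bound MetaData and a pre-existing users table in the database:

users = Table('users', metadata, autoload=True)

# the same name/metadata combination returns the singleton instance
users2 = Table('users', metadata, autoload=True, useexisting=True)
assert users is users2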

def accept_schema_visitor(self, visitor)

traverses the given visitor across the Column objects inside this Table, then calls the visit_table method on the visitor.

def append_column(self, column)

def append_index(self, index)

def append_index_column(self, column, index=None, unique=None)

Add an index or a column to an existing index of the same name.

def append_item(self, item)

appends a Column item or other schema item to this Table.

def create(self, connectable=None)

def deregister(self)

removes this table from its metadata. this does not issue a SQL DROP statement.

def drop(self, connectable=None)

def reload_values(self, *args)

clears out the columns and other properties of this Table, and reloads them from the given argument list. This is used with the "redefine" keyword argument sent to the metaclass constructor.

def tometadata(self, metadata, schema=None)

returns a singleton instance of this Table with a different Schema

back to section top
Module sqlalchemy.engine

Module Functions
def create_engine(*args, **kwargs)

creates a new Engine instance. Using the given strategy name, locates that strategy and invokes its create() method to produce the Engine. The strategies themselves are instances of EngineStrategy, and the built in ones are present in the sqlalchemy.engine.strategies module. Current implementations include "plain" and "threadlocal". The default used by this function is "threadlocal".

"plain" provides support for a Connection object which can be used to execute SQL queries with a specific underlying DBAPI connection.

"threadlocal" is similar to "plain" except that it adds support for a thread-local connection and transaction context, which allows a group of engine operations to participate using the same connection and transaction without the need for explicit passing of a Connection object.

The standard method of specifying the engine is via URL as the first positional argument, to indicate the appropriate database dialect and connection arguments, with additional keyword arguments sent as options to the dialect and resulting Engine.

The URL is in the form <dialect>://opt1=val1&opt2=val2, where <dialect> is a name such as "mysql", "oracle", or "postgres", and the options indicate username, password, database, etc. Supported keynames include "username", "user", "password", "pw", "db", "database", "host", and "filename".

**kwargs represents options to be sent to the Engine itself as well as the components of the Engine, including the Dialect, the ConnectionProvider, and the Pool. A list of common options is as follows:

pool=None : an instance of sqlalchemy.pool.DBProxy or sqlalchemy.pool.Pool to be used as the underlying source for connections (DBProxy/Pool is described in the previous section). If None, a default DBProxy will be created using the engine's own database module with the given arguments.

echo=False : if True, the Engine will log all statements as well as a repr() of their parameter lists to the engine's logger, which defaults to sys.stdout. An Engine instance's "echo" data member can be modified at any time to turn logging on and off. If set to the string 'debug', result rows will be printed to the standard output as well.

logger=None : a file-like object where logging output can be sent, if echo is set to True. This defaults to sys.stdout.

encoding='utf-8' : the encoding to be used when encoding/decoding Unicode strings

convert_unicode=False : True if unicode conversion should be applied to all str types

module=None : used by Oracle and Postgres, this is a reference to a DBAPI2 module to be used instead of the engine's default module. For Postgres, the default is psycopg2, or psycopg1 if 2 cannot be found. For Oracle, it's cx_Oracle; for MySQL, MySQLdb.

use_ansi=True : used only by Oracle; when False, the Oracle driver attempts to support a particular "quirk" of some Oracle databases, namely that the LEFT OUTER JOIN SQL syntax is not supported, and the "Oracle join" syntax of <column1>(+)=<column2> must be used in order to achieve a LEFT OUTER JOIN. It's advised that the Oracle database be configured for full ANSI support instead of using this feature.
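
For example, a sketch using an RFC-1738 style URL (the connect arguments here are assumptions for a local PostgreSQL database):

engine = create_engine('postgres://scott:tiger@localhost/test',
                       echo=True, convert_unicode=True)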

def engine_descriptors()

provides a listing of all the database implementations supported. this data is provided as a list of dictionaries, where each dictionary contains the following key/value pairs:

name : the name of the engine, suitable for use in the create_engine function

description: a plain description of the engine.

arguments : a dictionary describing the name and description of each parameter used to connect to this engine's underlying DBAPI.

This function is meant for usage in automated configuration tools that wish to query the user for database and connection information.

back to section top
Class Connectable(object)

interface for an object that can provide an Engine and a Connection object which corresponds to that Engine.

def contextual_connect(self)

returns a Connection object which may be part of an ongoing context.

def create(self, entity, **kwargs)

creates a table or index given an appropriate schema object.

def drop(self, entity, **kwargs)

engine = property()

returns the Engine which this Connectable is associated with.

def execute(self, object, *multiparams, **params)

back to section top
Class ComposedSQLEngine(Engine,Connectable)

Connects a ConnectionProvider, a Dialect and a CompilerFactory together to provide a default implementation of SchemaEngine.

def __init__(self, connection_provider, dialect, echo=False, logger=None, **kwargs)

def compiler(self, statement, parameters, **kwargs)

def connect(self, **kwargs)

returns a newly allocated Connection object.

def contextual_connect(self, close_with_result=False, **kwargs)

returns a Connection object which may be newly allocated, or may be part of some ongoing context. This Connection is meant to be used by the various "auto-connecting" operations.

def create(self, entity, connection=None, **kwargs)

creates a table or index within this engine's database connection given a schema.Table object.

def dispose(self)

def drop(self, entity, connection=None, **kwargs)

drops a table or index within this engine's database connection given a schema.Table object.

engine = property()

def execute(self, statement, *multiparams, **params)

def execute_compiled(self, compiled, *multiparams, **params)

def execute_default(self, default, **kwargs)

func = property()

def has_table(self, table_name)

def log(self, msg)

logs a message using this SQLEngine's logger stream.

name = property()

def raw_connection(self)

returns a DBAPI connection.

def reflecttable(self, table, connection=None)

given a Table object, reflects its columns and properties from the database.

def run_callable(self, callable_, connection=None, *args, **kwargs)

def text(self, text, *args, **kwargs)

returns a sql.text() object for performing literal queries.

def transaction(self, callable_, connection=None, *args, **kwargs)

executes the given function within a transaction boundary. this is a shortcut for explicitly calling begin() and commit(), and optionally rollback() when exceptions are raised. The given *args and **kwargs will be passed to the function, as well as the Connection used in the transaction.
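
A minimal sketch; the users table and the parameter are assumptions:

def do_insert(conn, name):
    conn.execute(users.insert(), name=name)

# begins a transaction, calls do_insert with the Connection,
# then commits, or rolls back if an exception is raised
engine.transaction(do_insert, name='fred')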

back to section top
Class Connection(Connectable)

represents a single DBAPI connection returned from the underlying connection pool. Provides execution support for string-based SQL statements as well as ClauseElement, Compiled and DefaultGenerator objects. provides a begin method to return Transaction objects.

def __init__(self, engine, connection=None, close_with_result=False)

def begin(self)

def close(self)

def connect(self)

connect() is implemented to return self so that an incoming Engine or Connection object can be treated similarly.

connection = property()

The underlying DBAPI connection managed by this Connection.

def contextual_connect(self, **kwargs)

contextual_connect() is implemented to return self so that an incoming Engine or Connection object can be treated similarly.

def create(self, entity, **kwargs)

creates a table or index given an appropriate schema object.

def default_schema_name(self)

def drop(self, entity, **kwargs)

drops a table or index given an appropriate schema object.

engine = property()

The Engine with which this Connection is associated (read only)

def execute(self, object, *multiparams, **params)

def execute_clauseelement(self, elem, *multiparams, **params)

def execute_compiled(self, compiled, *multiparams, **params)

executes a sql.Compiled object.

def execute_default(self, default, **kwargs)

def execute_text(self, statement, parameters=None)

def in_transaction(self)

def proxy(self, statement=None, parameters=None)

executes the given statement string and parameter object. the parameter object is expected to be the result of a call to compiled.get_params(). This callable is a generic version of a connection/cursor-specific callable that is produced within the execute_compiled method, and is used for objects that require this style of proxy when outside of an execute_compiled method, primarily the DefaultRunner.

def reflecttable(self, table, **kwargs)

reflects the columns in the given table from the database.

def run_callable(self, callable_)

def scalar(self, object, parameters=None, **kwargs)

should_close_with_result = property()

Indicates if this Connection should be closed when a corresponding ResultProxy is closed; this is essentially an auto-release mode.

back to section top
Class Transaction(object)

represents a Transaction in progress

def __init__(self, connection, parent)

def commit(self)

connection = property()

The Connection object referenced by this Transaction

is_active = property()

def rollback(self)
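
A typical usage sketch, assuming the hypothetical users table:

conn = engine.connect()
trans = conn.begin()
try:
    conn.execute(users.insert(), name='ed')
    trans.commit()
except:
    trans.rollback()
    raise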

back to section top
Class Dialect(AbstractDialect)

Adds behavior to the execution of queries to provide support for column defaults, differences between paramstyles, quirks in post-execution behavior, and generally consistent behavior across various DBAPIs.

The Dialect should also implement the following two attributes:

positional - True if the paramstyle for this Dialect is positional

paramstyle - the paramstyle to be used (some DBAPIs support multiple paramstyles)

supports_autoclose_results - usually True; if False, indicates that rows returned by fetchone() might not be just plain tuples, and may be "live" proxy objects which still require the cursor to be open in order to be read (such as pyPgSQL which has active filehandles for BLOBs). in that case, an auto-closing ResultProxy cannot automatically close itself after results are consumed.

convert_unicode - True if unicode conversion should be applied to all str types

encoding - type of encoding to use for unicode, usually defaults to 'utf-8'

def compiler(self, statement, parameters)

returns a sql.ClauseVisitor which will produce a string representation of the given ClauseElement and parameter dictionary. This object is usually a subclass of ansisql.ANSICompiler.

compiler is called within the context of the compile() method.

def convert_compiled_params(self, parameters)

given a sql.ClauseParameters object, returns an array or dictionary suitable to pass directly to this Dialect's DBAPI's execute method.

def create_connect_args(self, opts)

given a dictionary of key/value connect parameters, returns a tuple of *args/**kwargs suitable to send directly to the DBAPI's connect function. The connect args may have any number of the following keynames: host, hostname, database, dbname, user, username, password, pw, passwd, filename.

def dbapi(self)

subclasses override this method to provide the DBAPI module used to establish connections.

def defaultrunner(self, engine, proxy, **params)

returns a schema.SchemaVisitor instance that can execute defaults.

def do_begin(self, connection)

provides an implementation of connection.begin()

def do_commit(self, connection)

provides an implementation of connection.commit()

def do_execute(self, cursor, statement, parameters)

def do_executemany(self, cursor, statement, parameters)

def do_rollback(self, connection)

provides an implementation of connection.rollback()

def execution_context(self)

returns a new ExecutionContext object.

def get_default_schema_name(self, connection)

returns the currently selected schema given a connection

def has_table(self, connection, table_name)

def oid_column_name(self)

returns the oid column name for this dialect, or None if the dialect can't/won't support OID/ROWID.

def reflecttable(self, connection, table)

given a Connection and a Table object, reflects its columns and properties from the database.

def schemadropper(self, engine, proxy, **params)

returns a schema.SchemaVisitor instance that can drop schemas, when it is invoked to traverse a set of schema objects.

schemadropper is called via the drop() method on Table, Index, and others.

def schemagenerator(self, engine, proxy, **params)

returns a schema.SchemaVisitor instance that can generate schemas, when it is invoked to traverse a set of schema objects.

schemagenerator is called via the create() method on Table, Index, and others.

def supports_sane_rowcount(self)

indicates whether this dialect has standard behavior for the "rowcount" function on a statement handle. Provided because MySQL, notably, does not.

def type_descriptor(self, typeobj)

provides a database-specific TypeEngine object, given the generic object which comes from the types module. Subclasses will usually use the adapt_type() method in the types module to make this job easy.

back to section top
Class ConnectionProvider(object)

defines an interface that returns raw Connection objects (or compatible).

def dispose(self)

releases all resources corresponding to this ConnectionProvider, such as any underlying connection pools.

def get_connection(self)

this method should return a Connection or compatible object from a DBAPI which also contains a close() method. It is not defined what context this connection belongs to. It may be newly connected, returned from a pool, part of some other kind of context such as thread-local, or can be a fixed member of this object.

back to section top
Class ExecutionContext(object)

a messenger object for a Dialect that corresponds to a single execution. The Dialect should provide an ExecutionContext via its execution_context() method. The pre_exec and post_exec methods will be called for compiled statements, after which it is expected that the various methods last_inserted_ids, last_inserted_params, etc. will contain appropriate values, if applicable.

def get_rowcount(self, cursor)

returns the count of rows updated/deleted for an UPDATE/DELETE statement

def last_inserted_ids(self)

returns the list of the primary key values for the last insert statement executed. This does not apply to straight textual clauses; only to sql.Insert objects compiled against a schema.Table object, which are executed via statement.execute(). The order of items in the list is the same as that of the Table's 'primary_key' attribute.

In some cases, this method may invoke a query back to the database to retrieve the data, based on the "lastrowid" value in the cursor.

def last_inserted_params(self)

returns a dictionary of the full parameter dictionary for the last compiled INSERT statement, including any ColumnDefaults or Sequences that were pre-executed. this value is thread-local.

def last_updated_params(self)

returns a dictionary of the full parameter dictionary for the last compiled UPDATE statement, including any ColumnDefaults that were pre-executed. this value is thread-local.

def lastrow_has_defaults(self)

returns True if the last row INSERTED via a compiled insert statement contained PassiveDefaults, indicating that the database inserted data beyond that which we gave it. this value is thread-local.

def post_exec(self, engine, proxy, compiled, parameters)

called after the execution of a compiled statement. proxy is a callable that takes a string statement and a bind parameter list/dictionary.

def pre_exec(self, engine, proxy, compiled, parameters)

called before an execution of a compiled statement. proxy is a callable that takes a string statement and a bind parameter list/dictionary.

def supports_sane_rowcount(self)

indicates whether the underlying dialect has standard behavior for the "rowcount" function on a statement handle. Provided because MySQL, notably, does not.

back to section top
Class ResultProxy

wraps a DBAPI cursor object to provide access to row columns based on integer position, case-insensitive column name, or by schema.Column object. e.g.:

row = fetchone()

col1 = row[0] # access via integer position

col2 = row['col2'] # access via name

col3 = row[mytable.c.mycol] # access via Column object.

ResultProxy also contains a map of TypeEngine objects and will invoke the appropriate convert_result_value() method before returning columns.

def __init__(self, engine, connection, cursor, executioncontext=None, typemap=None)

ResultProxy objects are constructed via the execute() method on SQLEngine.

def close(self)

def fetchall(self)

fetches all rows, just like DBAPI cursor.fetchall().

def fetchone(self)

fetches one row, just like DBAPI cursor.fetchone().

def last_inserted_ids(self)

def last_inserted_params(self)

def last_updated_params(self)

def lastrow_has_defaults(self)

def supports_sane_rowcount(self)

back to section top
Class RowProxy

proxies a single cursor row for a parent ResultProxy.

def __init__(self, parent, row)

RowProxy objects are constructed by ResultProxy objects.

def close(self)

def has_key(self, key)

def items(self)

def keys(self)

def values(self)

back to section top
Module sqlalchemy.engine.strategies

defines different strategies for creating new instances of sql.Engine. by default there are two: the "thread-local" strategy and the "plain" strategy. new strategies can be added via constructing a new EngineStrategy object, which will add itself to the list of available strategies here, or replace one of the existing names. this can be accomplished via a mod; see the sqlalchemy/mods package for details.

Class EngineStrategy(object)

defines a function that receives input arguments and produces an instance of sql.Engine, typically an instance of sqlalchemy.engine.base.ComposedSQLEngine or a subclass.

def __init__(self, name)

constructs a new EngineStrategy object and sets it in the list of available strategies under this name.

def create(self, *args, **kwargs)

given arguments, returns a new sql.Engine instance.

back to section top
Class PlainEngineStrategy(EngineStrategy)

def __init__(self)

def create(self, name_or_url, **kwargs)

back to section top
Class ThreadLocalEngineStrategy(EngineStrategy)

def __init__(self)

def create(self, name_or_url, **kwargs)

back to section top
Module sqlalchemy.orm

the mapper package provides object-relational functionality, building upon the schema and sql packages and tying operations to class properties and constructors.

Module Functions
def backref(name, **kwargs)

def cascade_mappers(*classes_or_mappers)

given a list of classes and/or mappers, identifies the foreign key relationships between the given mappers or corresponding class mappers, and creates relation() objects representing those relationships, including a backreference. Attempts to find the "secondary" table in a many-to-many relationship as well. The names of the relations will be a lowercase version of the related class. In the case of one-to-many or many-to-many, the name will be "pluralized", which currently is based on the English language (i.e. an 's' or 'es' added to it).

def class_mapper(class_, entity_name=None)

given a class and optional entity_name, returns the primary Mapper associated with that key.

def clear_mappers()

removes all mappers that have been created thus far. when new mappers are created, they will be assigned to their classes as their primary mapper.

def defer(name, **kwargs)

returns a MapperOption that will convert the column property of the given name into a deferred load. Used with mapper.options()

def deferred(*columns, **kwargs)

returns a DeferredColumnProperty, which indicates that this object attribute should only be loaded from its corresponding table column when first accessed.

def eagerload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into an eager load. Used with mapper.options()

def extension(ext)

returns a MapperOption that will add the given MapperExtension to the mapper returned by mapper.options().

def lazyload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into a lazy load. Used with mapper.options()

def mapper(class_, table=None, *args, **params)

returns a new Mapper object.

def noload(name, **kwargs)

returns a MapperOption that will convert the property of the given name into a non-load. Used with mapper.options()

def object_mapper(object, raiseerror=True)

given an object, returns the primary Mapper associated with the object instance

def polymorphic_union(table_map, typecolname, aliasname='p_union')

def relation(*args, **kwargs)

provides a relationship of a primary Mapper to a secondary Mapper, which corresponds to a parent-child or associative table relationship.
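
A minimal sketch, assuming hypothetical User and Address classes mapped to the users and addresses tables above:

mapper(Address, addresses)
mapper(User, users, properties={
    'addresses' : relation(Address, backref='user')
})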

def undefer(name, **kwargs)

returns a MapperOption that will convert the column property of the given name into a non-deferred (regular column) load. Used with mapper.options().
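
These directives are applied through the mapper's options() method; e.g., a sketch assuming the hypothetical User mapping above:

usermapper = class_mapper(User)
# a copy of the mapper with 'addresses' eagerly loaded and 'status' deferred
eagermapper = usermapper.options(eagerload('addresses'), defer('status'))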

back to section top
Class Mapper(object)

Persists object instances to and from schema.Table objects via the sql package. Instances of this class should be constructed through this package's mapper() or relation() function.

def __init__(self, class_, local_table, properties=None, primary_key=None, is_primary=False, non_primary=False, inherits=None, inherit_condition=None, extension=None, order_by=False, allow_column_override=False, entity_name=None, always_refresh=False, version_id_col=None, polymorphic_on=None, polymorphic_map=None, polymorphic_identity=None, concrete=False, select_table=None)

def add_polymorphic_mapping(self, key, class_or_mapper, entity_name=None)

def add_properties(self, dict_of_properties)

adds the given dictionary of properties to this mapper, using add_property.

def add_property(self, key, prop, init=True, skipmissing=False)

adds an additional property to this mapper. this is the same as if it were specified within the 'properties' argument to the constructor. if the named property already exists, this will replace it. Useful for circular relationships, or overriding the parameters of auto-generated properties such as backreferences.

def base_mapper(self)

returns the ultimate base mapper in an inheritance chain

def cascade_callable(self, type, object, callable_, recursive=None)

def cascade_iterator(self, type, object, callable_=None, recursive=None)

def compile(self, whereclause=None, **options)

deprecated. use Query instead.

def copy(self, **kwargs)

def count(self, whereclause=None, params=None, **kwargs)

deprecated. use Query instead.

def count_by(self, *args, **params)

deprecated. use Query instead.

def delete_obj(self, objects, uow)

called by a UnitOfWork object to delete objects, which involves a DELETE statement for each table used by this mapper, for each object in the list.

def get(self, ident, **kwargs)

deprecated. use Query instead.

def get_by(self, *args, **params)

deprecated. use Query instead.

def get_select_mapper(self)

def get_session(self)

returns the contextual session provided by the mapper extension chain

raises InvalidRequestError if a session cannot be retrieved from the extension chain

def has_eager(self)

returns True if one of the properties attached to this Mapper is eager loading

def identity(self, instance)

returns the identity (list of primary key values) for the given instance. The list of values can be fed directly into the get() method as mapper.get(*key).

def identity_key(self, primary_key)

returns the instance key for the given identity value. this is a global tracking object used by the Session, and is usually available off a mapped object as instance._instance_key.

def instance_key(self, instance)

returns the instance key for the given instance. this is a global tracking object used by the Session, and is usually available off a mapped object as instance._instance_key.

def instances(self, cursor, session, *mappers, **kwargs)

given a cursor (ResultProxy) from an SQLEngine, returns a list of object instances corresponding to the rows in the cursor.

def is_assigned(self, instance)

returns True if this mapper handles the given instance. this is dependent not only on class assignment but the optional "entity_name" parameter as well.

def options(self, *options, **kwargs)

uses this mapper as a prototype for a new mapper with different behavior. *options is a list of options directives, which include eagerload(), lazyload(), and noload()

def populate_instance(self, session, instance, row, identitykey, imap, isnew, frommapper=None)

def primary_mapper(self)

returns the primary mapper corresponding to this mapper's class key (class + entity_name)

def query(self, session=None)

deprecated. use Query instead.

def register_dependencies(self, uowcommit, *args, **kwargs)

called by an instance of unitofwork.UOWTransaction to register which mappers are dependent on which, as well as DependencyProcessor objects which will process lists of objects in between saves and deletes.

def save_obj(self, objects, uow, postupdate=False)

called by a UnitOfWork object to save objects, which involves either an INSERT or an UPDATE statement for each table used by this mapper, for each element of the list.

def select(self, arg=None, **kwargs)

deprecated. use Query instead.

def select_by(self, *args, **params)

deprecated. use Query instead.

def select_statement(self, statement, **params)

deprecated. use Query instead.

def select_text(self, text, **params)

deprecated. use Query instead.

def select_whereclause(self, whereclause=None, params=None, **kwargs)

deprecated. use Query instead.

def selectfirst(self, *args, **params)

deprecated. use Query instead.

def selectfirst_by(self, *args, **params)

deprecated. use Query instead.

def selectone(self, *args, **params)

deprecated. use Query instead.

def selectone_by(self, *args, **params)

deprecated. use Query instead.

def set_property(self, key, prop)

def translate_row(self, tomapper, row)

attempts to take a row and translate its values to a row that can be understood by another mapper.

def using(self, session)

deprecated. use Query instead.

back to section top
Class MapperExtension(object)

def __init__(self)

def after_delete(self, mapper, connection, instance)

called after an object instance is DELETEed

def after_insert(self, mapper, connection, instance)

called after an object instance has been INSERTed

def after_update(self, mapper, connection, instance)

called after an object instance is UPDATED

def append_result(self, mapper, session, row, imap, result, instance, isnew, populate_existing=False)

called when an object instance is being appended to a result list.

If this method returns True, it is assumed that the mapper should do the appending, else if this method returns False, it is assumed that the append was handled by this method.

mapper - the mapper doing the operation

row - the result row from the database

imap - a dictionary that is storing the running set of objects collected from the current result set

result - an instance of util.HistoryArraySet(), which may be an attribute on an object if this is a related object load (lazy or eager). use result.append_nohistory(value) to append objects to this list.

instance - the object instance to be appended to the result

isnew - indicates if this is the first time we have seen this object instance in the current result set. if you are selecting from a join, such as an eager load, you might see the same object instance many times in the same result set.

populate_existing - usually False, indicates if object instances that were already in the main identity map, i.e. were loaded by a previous select(), get their attributes overwritten

def before_delete(self, mapper, connection, instance)

called before an object instance is DELETEed

def before_insert(self, mapper, connection, instance)

called before an object instance is INSERTed into its table.

this is a good place to set up primary key values and such that aren't handled otherwise.

def before_update(self, mapper, connection, instance)

called before an object instance is UPDATED

def chain(self, ext)

def create_instance(self, mapper, session, row, imap, class_)

called when a new object instance is about to be created from a row. the method can choose to create the instance itself, or it can return None to indicate normal object creation should take place.

mapper - the mapper doing the operation

row - the result row from the database

imap - a dictionary that is storing the running set of objects collected from the current result set

class_ - the class we are mapping.

def get_session(self)

called to retrieve a contextual Session instance with which to register a new object. Note: this is not called if a session is provided with the __init__ params (i.e. _sa_session)

def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew)

called right before the mapper, after creating an instance from a row, passes the row to its MapperProperty objects which are responsible for populating the object's attributes. If this method returns True, it is assumed that the mapper should do the population, else if this method returns False, it is assumed that the population was handled by this method.

Essentially, this method is used to have a different mapper populate the object:

def populate_instance(self, mapper, session, instance, row, identitykey, imap, isnew):
    othermapper.populate_instance(session, instance, row, identitykey, imap, isnew, frommapper=mapper)
    return True

def select(self, query, *args, **kwargs)

overrides the select method of the Query object

def select_by(self, query, *args, **kwargs)

overrides the select_by method of the Query object

back to section top
Module sqlalchemy.orm.query

Class Query(object)

encapsulates the object-fetching operations provided by Mappers.

def __init__(self, class_or_mapper, session=None, entity_name=None, **kwargs)

def count(self, whereclause=None, params=None, **kwargs)

def count_by(self, *args, **params)

returns the count of instances based on the given clauses and key/value criterion. The criterion is constructed in the same way as the select_by() method.

def get(self, ident, **kwargs)

returns an instance of the object based on the given identifier, or None if not found. The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.

def get_by(self, *args, **params)

returns a single object instance based on the given key/value criterion. this is either the first value in the result list, or None if the list is empty.

the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.

e.g. u = usermapper.get_by(user_name = 'fred')

def instances(self, clauseelement, params=None, *args, **kwargs)

def join_by(self, *args, **params)

like select_by, but returns a ClauseElement representing the WHERE clause that would normally be sent to select_whereclause by select_by.

def join_to(self, key)

given the key name of a property, will recursively descend through all child properties from this Query's mapper to locate the property, and will return a ClauseElement representing a join from this Query's mapper to the endmost mapper.

def join_via(self, keys)

given a list of keys that represents a path from this Query's mapper to a related mapper based on names of relations from one mapper to the next, returns a ClauseElement representing a join from this Query's mapper to the endmost mapper.

def load(self, ident, **kwargs)

returns an instance of the object based on the given identifier. If not found, raises an exception. The method will *remove all pending changes* to the object already existing in the Session. The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.

def options(self, *args, **kwargs)

returns a new Query object using the given MapperOptions.

def select(self, arg=None, **kwargs)

selects instances of the object from the database.

arg can be any ClauseElement, which will form the criterion with which to load the objects.

For more advanced usage, arg can also be a Select statement object, which will be executed and its resulting rowset used to build new object instances. in this case, the developer must ensure that an adequate set of columns exists in the rowset with which to build new object instances.

def select_by(self, *args, **params)

returns an array of object instances based on the given clauses and key/value criterion.

*args is a list of zero or more ClauseElements which will be connected by AND operators.

**params is a set of zero or more key/value parameters which are converted into ClauseElements. the keys are mapped to property or column names mapped by this mapper's Table, and the values are coerced into a WHERE clause separated by AND operators. If the local property/column names don't contain the key, a search will be performed against this mapper's immediate list of relations as well, forming the appropriate join conditions if a matching property is located.

e.g. result = usermapper.select_by(user_name = 'fred')

def select_statement(self, statement, **params)

def select_text(self, text, **params)

def select_whereclause(self, whereclause=None, params=None, **kwargs)

def selectfirst(self, *args, **params)

works like select(), but only returns the first result by itself, or None if no objects returned.

def selectfirst_by(self, *args, **params)

works like select_by(), but only returns the first result by itself, or None if no objects returned. Synonymous with get_by()

def selectone(self, *args, **params)

works like selectfirst(), but throws an error if not exactly one result was returned.

def selectone_by(self, *args, **params)

works like selectfirst_by(), but throws an error if not exactly one result was returned.

session = property()

table = property()

back to section top
Module sqlalchemy.orm.session

Module Functions
def class_mapper(class_, **kwargs)

def get_id_key(ident, class_, entity_name=None)

def get_row_key(row, class_, primary_key, entity_name=None)

def get_session(obj=None)

deprecated

def object_mapper(obj)

def object_session(obj)

back to section top
Class Session(object)

encapsulates a set of objects being operated upon within an object-relational operation.

def __init__(self, bind_to=None, hash_key=None, import_session=None, echo_uow=False)

def begin(self, *obj)

deprecated

def bind_mapper(self, mapper, bindto)

binds the given Mapper to the given Engine or Connection. All subsequent operations involving this Mapper will use the given bindto.

def bind_table(self, table, bindto)

binds the given Table to the given Engine or Connection. All subsequent operations involving this Table will use the given bindto.

def clear(self)

removes all object instances from this Session. this is equivalent to calling expunge() for all objects in this Session.

def close(self)

closes this Session.

def commit(self, *obj)

deprecated

def connect(self, mapper=None, **kwargs)

returns a unique connection corresponding to the given mapper. this connection will not be part of any pre-existing transactional context.

def connection(self, mapper, **kwargs)

returns a Connection corresponding to the given mapper. used by the execute() method which performs select operations for Mapper and Query. if this Session is transactional, the connection will be in the context of this session's transaction. otherwise, the connection is returned by the contextual_connect method, which some Engines override to return a thread-local connection, and will have close_with_result set to True.

the given **kwargs will be sent to the engine's contextual_connect() method, if no transaction is in progress.

def create_transaction(self, **kwargs)

returns a new SessionTransaction corresponding to an existing or new transaction. if the transaction is new, the returned SessionTransaction will have commit control over the underlying transaction, else will have rollback control only.
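
A usage sketch:

trans = session.create_transaction()
try:
    session.flush()
    trans.commit()    # commits only if this SessionTransaction began the transaction
except:
    trans.rollback()
    raise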

def delete(self, object, entity_name=None)

deleted = property()

a Set of all objects marked as 'deleted' within this Session

dirty = property()

a Set of all objects marked as 'dirty' within this Session

def execute(self, mapper, clause, params, **kwargs)

using the given mapper to identify the appropriate Engine or Connection to be used for statement execution, executes the given ClauseElement using the provided parameter dictionary. Returns a ResultProxy corresponding to the execution's results. If this method allocates a new Connection for the operation, then the ResultProxy's close() method will release the resources of the underlying Connection; otherwise it's a no-op.

def expire(self, object)

invalidates the data in the given object, setting its attributes to refresh themselves the next time they are requested.

def expunge(self, object)

removes the given object from this Session. this will free all internal references to the object.

def flush(self, objects=None)

flushes all the object modifications present in this session to the database. 'objects' is a list or tuple of objects specifically to be flushed.
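
A brief sketch, assuming the hypothetical User mapping from earlier and an Engine named "engine":

session = Session(bind_to=engine)
u = User()
u.name = 'fred'
session.save(u)     # u now appears in session.new
session.flush()     # issues the INSERT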

def get(self, class_, ident, **kwargs)

returns an instance of the object based on the given identifier, or None if not found. The ident argument is a scalar or tuple of primary key column values in the order of the table def's primary key columns.

the entity_name keyword argument may also be specified which further qualifies the underlying Mapper used to perform the query.

def get_bind(self, mapper)

given a Mapper, returns the Engine or Connection which is used to execute statements on behalf of this Mapper. Calling connect() on the return result will always result in a Connection object. This method disregards any SessionTransaction that may be in progress.

The order of searching is as follows:

if an Engine or Connection was bound to this Mapper specifically within this Session, returns that Engine or Connection.

if an Engine or Connection was bound within this Session to this Mapper's underlying Table (i.e. to the Table rather than to the Mapper directly), returns that Engine or Connection.

if an Engine or Connection was bound to this Session, returns that Engine or Connection.

finally, returns the Engine which was bound directly to the Table's MetaData object.

If no Engine is bound to the Table, an exception is raised.

def has_key(self, key)

identity_map = property()

a WeakValueDictionary consisting of all objects within this Session keyed to their _instance_key value.

def import_instance(self, *args, **kwargs)

deprecated; a synonym for merge()

def is_expired(self, instance, **kwargs)

def load(self, class_, ident, **kwargs)

returns an instance of the object based on the given identifier. If not found, raises an exception. The method will *remove all pending changes* to the object already existing in the Session. The ident argument is a scalar or tuple of primary key columns in the order of the table def's primary key columns.

the entity_name keyword argument may also be specified which further qualifies the underlying Mapper used to perform the query.

def mapper(self, class_, entity_name=None)

given a Class, returns the primary Mapper responsible for persisting it

def merge(self, object, entity_name=None)

new = property()

a Set of all objects marked as 'new' within this Session.

def query(self, mapper_or_class, entity_name=None)

given a mapper or Class, returns a new Query object corresponding to this Session and the mapper, or the class's primary mapper.

def refresh(self, object)

reloads the attributes for the given object from the database, clearing any changes made.

def save(self, object, entity_name=None)

Adds a transient (unsaved) instance to this Session. This operation cascades the "save_or_update" method to associated instances if the relation is mapped with cascade="save-update".

The 'entity_name' keyword argument will further qualify the specific Mapper used to handle this instance.

def save_or_update(self, object, entity_name=None)

def scalar(self, mapper, clause, params, **kwargs)

works like execute() but returns a scalar result.

sql = property()

def update(self, object, entity_name=None)

Brings the given detached (saved) instance into this Session. If there is a persistent instance with the same identifier (i.e. a saved instance already associated with this Session), an exception is raised. This operation cascades the "save_or_update" method to associated instances if the relation is mapped with cascade="save-update".
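A minimal sketch, assuming 'db' is an Engine and 'user' is an instance loaded by some other, since-discarded Session (all names hypothetical):

>>> session = Session(bind_to=db)
>>> session.update(user)      # re-associate the detached instance
>>> user.user_name = 'jack'
>>> session.flush()           # the UPDATE is issued via this Session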

back to section top
Class SessionTransaction(object)

def __init__(self, session, parent=None, autoflush=True)

def add(self, connectable)

def close(self)

def commit(self)

def connection(self, mapper_or_class, entity_name=None)

def get_or_add(self, connectable)

def rollback(self)

back to section top
Module sqlalchemy.pool

provides a connection pool implementation, which optionally manages connections on a thread-local basis. Also provides a DBAPI2 transparency layer, so that pools can be managed automatically, based on module type and connect arguments, simply by calling the regular DBAPI connect() methods.

Module Functions
def clear_managers()

removes all current DBAPI2 managers. All pools and connections are disposed.

def manage(module, **params)

given a DBAPI2 module and pool management parameters, returns a proxy for the module that will automatically pool connections. Options are delivered to an underlying DBProxy object. A usage sketch follows the options below.

Arguments: module : a DBAPI2 database module.

Options: echo=False : if set to True, connections being checked out of and returned to the pool will be logged to the standard output, as well as pool sizing information.

use_threadlocal=True : if set to True, repeated calls to connect() within the same application thread will be guaranteed to return the same connection object, if one has already been retrieved from the pool and has not been returned yet. This allows code to retrieve a connection from the pool, and then while still holding on to that connection, to call other functions which also ask the pool for a connection of the same arguments; those functions will act upon the same connection that the calling method is using.

poolclass=QueuePool : the default class used by the pool module to provide pooling. QueuePool uses the Python Queue.Queue class to maintain a list of available connections.

pool_size=5 : used by QueuePool - the size of the pool to be maintained. This is the largest number of connections that will be kept persistently in the pool. Note that the pool begins with no connections; once this number of connections is requested, that number of connections will remain.

max_overflow=10 : the maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of "sleeping" connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections.
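A minimal sketch; the sqlite3 module of Python 2.5 is assumed here, but any DBAPI2 module (such as pysqlite2's dbapi2 on earlier Pythons) works the same way:

>>> from sqlalchemy import pool
>>> import sqlite3
>>> sqlite3 = pool.manage(sqlite3, pool_size=5, max_overflow=10)
>>> conn = sqlite3.connect('tutorial.db')   # checked out from (or added to) the pool
>>> conn.close()                            # returns the connection to the pool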

back to section top
Class DBProxy(object)

proxies a DBAPI2 connect() call to a pooled connection keyed to the specific connect parameters.

def __init__(self, module, poolclass=QueuePool, **params)

module is a DBAPI2 module. poolclass is a Pool class, defaulting to QueuePool. Other parameters are sent to the Pool object's constructor.

def close(self)

def connect(self, *args, **params)

connects to a database using this DBProxy's module and the given connect arguments. If the arguments match an existing pool, the connection will be returned from the pool's current thread-local connection instance, or, if there is no thread-local connection instance, it will be checked out from the set of pooled connections. If the pool has no available connections and allows new connections to be created, a new database connection will be made.

def dispose(self, *args, **params)

disposes the connection pool referenced by the given connect arguments.

def get_pool(self, *args, **params)

back to section top
Class Pool(object)

def __init__(self, echo=False, use_threadlocal=True, logger=None)

def connect(self)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def get(self)

def log(self, msg)

def return_conn(self, agent)

def return_invalid(self)

def status(self)

def unique_connection(self)

back to section top
Class QueuePool(Pool)

uses Queue.Queue to maintain a fixed-size list of connections.

def __init__(self, creator, pool_size=5, max_overflow=10, timeout=30, **params)
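A minimal sketch of constructing a QueuePool directly; the creator argument is any callable that returns a new DBAPI connection (the sqlite3 module is an assumption):

>>> import sqlite3
>>> p = QueuePool(lambda: sqlite3.connect('tutorial.db'), pool_size=5, max_overflow=10)
>>> c = p.connect()   # checks a connection out of the pool
>>> c.close()         # returns it to the pool rather than closing it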

def checkedin(self)

def checkedout(self)

def dispose(self)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def overflow(self)

def size(self)

def status(self)

back to section top
Class SingletonThreadPool(Pool)

Maintains one connection per thread, never moving a connection to another thread. This pool is used for SQLite.

def __init__(self, creator, **params)

def dispose(self)

def do_get(self)

def do_return_conn(self, conn)

def do_return_invalid(self)

def status(self)

back to section top
Module sqlalchemy.ext.sessioncontext

Class SessionContext(object)

A simple wrapper for ScopedRegistry that provides a "current" property which can be used to get, set, or remove the session in the current scope.

By default this object provides thread-local scoping, which is the default scope provided by sqlalchemy.util.ScopedRegistry.

Usage:

    engine = create_engine(...)

    def session_factory():
        return Session(bind_to=engine)

    context = SessionContext(session_factory)

    s = context.current                              # get thread-local session
    context.current = Session(bind_to=other_engine)  # set current session
    del context.current                              # discard the thread-local session (a new
                                                     # one will be created on the next call
                                                     # to context.current)

def __init__(self, session_factory, scopefunc=None)

current = property()

Property used to get/set/del the session in the current scope

def del_current(self)

def get_current(self)

mapper_extension = property()

returns a MapperExtension that implements get_session using this context
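A minimal sketch of wiring the extension into a mapper, reusing the session_factory from the usage example above (User and users_table are hypothetical):

>>> context = SessionContext(session_factory)
>>> mapper(User, users_table, extension=context.mapper_extension)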

def set_current(self, session)

back to section top
Class SessionContextExt(MapperExtension)

a mapper extension that provides sessions to a mapper using a SessionContext

def __init__(self, context)

def get_session(self)

back to section top
Module sqlalchemy.mods.threadlocal

Module Functions
def assign_mapper(class_, *args, **kwargs)

back to section top
Class Objectstore(SessionContext)

def get_session(self)

back to section top
Module sqlalchemy.ext.selectresults

Class SelectResults(object)

Builds a query one component at a time via separate method calls, each call transforming the previous SelectResults instance into a new SelectResults instance with further limiting criteria added. When interpreted in an iterator context (such as by calling list(selectresults)), executes the query.
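A minimal sketch of the generative style (User, users_table and the column names are hypothetical):

>>> from sqlalchemy.ext.selectresults import SelectResults
>>> results = SelectResults(session.query(User))
>>> adults = results.filter(users_table.c.age > 18).order_by(users_table.c.user_name)
>>> list(adults.limit(10))    # the SELECT is issued here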

def __init__(self, query, clause=None, ops={})

constructs a new SelectResults using the given Query object and optional WHERE clause. ops is an optional dictionary of bind parameter values.

def avg(self, col)

executes the SQL avg() function against the given column

def clone(self)

creates a copy of this SelectResults.

def count(self)

executes the SQL count() function against the SelectResults criterion.

def filter(self, clause)

applies an additional WHERE clause against the query.

def limit(self, limit)

applies a LIMIT to the query.

def list(self)

returns the results represented by this SelectResults as a list. this results in an execution of the underlying query.

def max(self, col)

executes the SQL max() function against the given column

def min(self, col)

executes the SQL min() function against the given column

def offset(self, offset)

applies an OFFSET to the query.

def order_by(self, order_by)

applies an ORDER BY to the query.

def sum(self, col)

executes the SQL sum() function against the given column

back to section top
Class SelectResultsExt(MapperExtension)

a MapperExtension that provides SelectResults functionality for the results of query.select_by() and query.select()

def select(self, query, arg=None, **kwargs)

def select_by(self, query, *args, **params)

back to section top
Module sqlalchemy.exceptions

Class ArgumentError

raised for all those conditions where invalid arguments are sent to constructed objects. This error generally corresponds to construction-time state errors.

back to section top
Class AssertionError

corresponds to internal state being detected as invalid

back to section top
Class DBAPIError

raised when something unexpected happens with a particular DBAPI version

back to section top
Class FlushError

raised when an invalid condition is detected upon a flush()

back to section top
Class InvalidRequestError

SQLAlchemy was asked to do something it can't do, such as return nonexistent data. This error generally corresponds to runtime state errors.

back to section top
Class SQLAlchemyError

generic error class

back to section top
Class SQLError

raised when the execution of a SQL statement fails. Includes accessors for the underlying exception, as well as the SQL statement and bind parameters.

def __init__(self, statement, params, orig)
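A minimal sketch of catching the error; the statement, params and orig attribute names are assumptions inferred from the constructor signature above, and 'db' is an Engine as in the tutorial:

>>> from sqlalchemy import exceptions
>>> try:
...     db.execute("SELECT * FROM no_such_table")
... except exceptions.SQLError, e:
...     print e.statement   # the failed SQL (accessor names assumed)
...     print e.orig        # the underlying DBAPI exception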

back to section top
Module sqlalchemy.ext.proxy

Class AutoConnectEngine(BaseProxyEngine)

An Engine proxy that automatically connects when necessary.

def __init__(self, dburi, **kwargs)

def get_engine(self)

back to section top
Class BaseProxyEngine(Engine)

Basis for all proxy engines.

def compiler(self, *args, **kwargs)

this method is required to be present, as it overrides the compiler method present in sql.Engine

engine = property()

def execute_compiled(self, *args, **kwargs)

this method is required to be present, as it overrides the execute_compiled method present in sql.Engine

def get_engine(self)

def set_engine(self, engine)

back to section top
Class ProxyEngine(BaseProxyEngine)

Engine proxy for lazy and late initialization.

This engine will delegate access to a real engine set with connect().
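A minimal sketch of the lazy-initialization pattern; the connect() arguments are presumably forwarded to create_engine, and the URL here is illustrative:

>>> from sqlalchemy.ext.proxy import ProxyEngine
>>> engine = ProxyEngine()                     # no real engine attached yet
>>> # ... define tables and mappers against 'engine' at import time ...
>>> engine.connect('sqlite:///tutorial.db')    # attach the real engine later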

def __init__(self, **kwargs)

def connect(self, *args, **kwargs)

Establish connection to a real engine.

def get_engine(self)

def set_engine(self, engine)

back to section top