This document was originally created in early 2004 when SQLite version 2 was still in widespread use and was written to introduce the new concepts of SQLite version 3 to readers who were already familiar with SQLite version 2. But these days, most readers of this document have probably never seen SQLite version 2 and are only familiar with SQLite version 3. Nevertheless, this document continues to serve as an authoritative reference to how database file locking works in SQLite version 3.

The document only describes locking for the older rollback-mode transaction mechanism. Locking for the newer write-ahead log or WAL mode is described separately.

1.0 File Locking And Concurrency In SQLite Version 3

SQLite Version 3.0.0 introduced a new locking and journaling mechanism designed to improve concurrency over SQLite version 2 and to reduce the writer starvation problem. The new mechanism also allows atomic commits of transactions involving multiple database files. This document describes the new locking mechanism. The intended audience is programmers who want to understand and/or modify the pager code and reviewers working to verify the design of SQLite version 3.

2.0 Overview

Locking and concurrency control are handled by the pager module. The pager module is responsible for making SQLite "ACID" (Atomic, Consistent, Isolated, and Durable). The pager module makes sure changes happen all at once, that either all changes occur or none of them do, that two or more processes do not try to access the database in incompatible ways at the same time, and that once changes have been written they persist until explicitly deleted. The pager also provides a memory cache of some of the contents of the disk file.

The pager is unconcerned with the details of B-Trees, text encodings, indices, and so forth. From the point of view of the pager the database consists of a single file of uniform-sized blocks. Each block is called a "page" and is usually 1024 bytes in size. The pages are numbered beginning with 1. So the first 1024 bytes of the database are called "page 1" and the second 1024 bytes are called "page 2" and so forth. All other encoding details are handled by higher layers of the library. The pager communicates with the operating system using one of several modules (Examples: os_unix.c, os_win.c) that provides a uniform abstraction for operating system services.
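
The page numbering described above is simple arithmetic. The following is a minimal sketch, assuming the 1024-byte page size quoted in the text; the function name page_byte_range is hypothetical and is not part of SQLite.

```python
PAGE_SIZE = 1024  # bytes; the "usual" size quoted above, assumed for this sketch


def page_byte_range(page_number, page_size=PAGE_SIZE):
    """Return (start, end) byte offsets of a page; end is exclusive."""
    if page_number < 1:
        raise ValueError("pages are numbered beginning with 1")
    start = (page_number - 1) * page_size
    return start, start + page_size


print(page_byte_range(1))  # (0, 1024)
print(page_byte_range(2))  # (1024, 2048)
```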

The pager module effectively controls access for separate threads, or separate processes, or both. Throughout this document whenever the word "process" is written you may substitute the word "thread" without changing the truth of the statement.

3.0 Locking

From the point of view of a single process, a database file can be in one of five locking states:

UNLOCKED No locks are held on the database. The database may be neither read nor written. Any internally cached data is considered suspect and subject to verification against the database file before being used. Other processes can read or write the database as their own locking states permit. This is the default state.
SHARED The database may be read but not written. Any number of processes can hold SHARED locks at the same time, hence there can be many simultaneous readers. But no other thread or process is allowed to write to the database file while one or more SHARED locks are active.
RESERVED A RESERVED lock means that the process is planning on writing to the database file at some point in the future but that it is currently just reading from the file. Only a single RESERVED lock may be active at one time, though multiple SHARED locks can coexist with a single RESERVED lock. RESERVED differs from PENDING in that new SHARED locks can be acquired while there is a RESERVED lock.
PENDING A PENDING lock means that the process holding the lock wants to write to the database as soon as possible and is just waiting on all current SHARED locks to clear so that it can get an EXCLUSIVE lock. No new SHARED locks are permitted against the database if a PENDING lock is active, though existing SHARED locks are allowed to continue.
EXCLUSIVE An EXCLUSIVE lock is needed in order to write to the database file. Only one EXCLUSIVE lock is allowed on the file and no other locks of any kind are allowed to coexist with an EXCLUSIVE lock. In order to maximize concurrency, SQLite works to minimize the amount of time that EXCLUSIVE locks are held.

The operating system interface layer understands and tracks all five locking states described above. The pager module only tracks four of the five locking states. A PENDING lock is always just a temporary stepping stone on the path to an EXCLUSIVE lock and so the pager module does not track PENDING locks.
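
The compatibility rules among the five states can be modeled as a small predicate. This is only a sketch of the rules as stated above, not the code in os_unix.c or os_win.c. The names Lock and can_acquire are hypothetical, and the model assumes that at most one process at a time is at RESERVED or above.

```python
from enum import IntEnum


class Lock(IntEnum):
    UNLOCKED = 0
    SHARED = 1
    RESERVED = 2
    PENDING = 3
    EXCLUSIVE = 4


def can_acquire(wanted, held_by_others):
    """Return True if `wanted` is compatible with the locks other processes hold."""
    strongest = max(held_by_others, default=Lock.UNLOCKED)
    if wanted == Lock.UNLOCKED:
        return True
    if wanted == Lock.SHARED:
        # New readers are admitted unless some process holds PENDING or EXCLUSIVE.
        return strongest < Lock.PENDING
    if wanted in (Lock.RESERVED, Lock.PENDING):
        # Assumption of this model: only one process may be at RESERVED or above.
        return strongest <= Lock.SHARED
    # EXCLUSIVE coexists with no other lock of any kind.
    return strongest == Lock.UNLOCKED


print(can_acquire(Lock.SHARED, [Lock.SHARED, Lock.RESERVED]))  # True
print(can_acquire(Lock.SHARED, [Lock.PENDING]))                # False
```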

4.0 The Rollback Journal

When a process wants to change a database file (and it is not in WAL mode), it first records the original unchanged database content in a rollback journal. The rollback journal is an ordinary disk file that is always located in the same directory or folder as the database file and has the same name as the database file with the addition of a -journal suffix. The rollback journal also records the initial size of the database so that if the database file grows it can be truncated back to its original size on a rollback.
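
The journal naming rule can be observed from Python's standard sqlite3 module. The sketch below assumes a build of SQLite that writes the rollback journal to disk as soon as the first page is changed, which is the common case; the file name demo.db and the helper journal_path are made up for the example.

```python
import os
import sqlite3
import tempfile


def journal_path(db_path):
    """Same directory and name as the database, plus the -journal suffix."""
    return db_path + "-journal"


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(db, isolation_level=None)  # no implicit BEGIN
    conn.execute("PRAGMA journal_mode=DELETE").fetchone()
    conn.execute("CREATE TABLE t(x)")

    conn.execute("BEGIN")
    conn.execute("INSERT INTO t VALUES (1)")
    journal_during = os.path.exists(journal_path(db))  # journal present mid-transaction
    conn.execute("COMMIT")
    journal_after = os.path.exists(journal_path(db))   # journal gone after commit
    conn.close()

print(journal_during, journal_after)
```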

If SQLite is working with multiple databases at the same time (using the ATTACH command) then each database has its own rollback journal. But there is also a separate aggregate journal called the super-journal. The super-journal does not contain page data used for rolling back changes. Instead the super-journal contains the names of the individual database rollback journals for each of the ATTACHed databases. Each of the individual database rollback journals also contains the name of the super-journal. If there are no ATTACHed databases (or if none of the ATTACHed databases is participating in the current transaction) no super-journal is created and the normal rollback journal contains an empty string in the place normally reserved for recording the name of the super-journal.

A rollback journal is said to be hot if it needs to be rolled back in order to restore the integrity of its database. A hot journal is created when a process is in the middle of a database update and a program or operating system crash or power failure prevents the update from completing. Hot journals are an exception condition. Hot journals exist to recover from crashes and power failures. If everything is working correctly (that is, if there are no crashes or power failures) you will never get a hot journal.

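If no super-journal is involved, then a journal is hot if it exists and has a non-zero header and its corresponding database file does not have a RESERVED lock. If a super-journal is named in the file journal, then the file journal is hot if its super-journal exists and there is no RESERVED lock on the corresponding database file. It is important to understand when a journal is hot, so the preceding rules are repeated here as a list. A journal is hot if all of the following are true:

  - The journal file exists.
  - The journal header is non-zero.
  - Its super-journal exists, or the super-journal name is an empty string.
  - There is no RESERVED lock on the corresponding database file.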
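
The hot-journal rules of this section can be written as a single predicate. This is a sketch of the rules exactly as worded above and not the test the pager performs; the function and parameter names are hypothetical.

```python
def journal_is_hot(journal_exists, header_is_nonzero, super_journal_name,
                   super_journal_exists, reserved_lock_held):
    """Apply the hot-journal rules to already-gathered facts about one journal."""
    if not journal_exists or reserved_lock_held:
        return False
    if super_journal_name == "":
        # No super-journal involved: a non-zero header decides.
        return header_is_nonzero
    # A super-journal is named: the journal is hot only while it still exists.
    return super_journal_exists


# A crashed single-file transaction leaves a hot journal.
print(journal_is_hot(True, True, "", False, False))           # True
# A live writer holds RESERVED, so its journal is not hot.
print(journal_is_hot(True, True, "", False, True))            # False
# The named super-journal has been deleted: the commit already happened.
print(journal_is_hot(True, True, "main.db-mjXYZ", False, False))  # False
```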

4.1 Dealing with hot journals

Before reading from a database file, SQLite always checks to see if that database file has a hot journal. If the file does have a hot journal, then the journal is rolled back before the file is read. In this way, we ensure that the database file is in a consistent state before it is read.

When a process wants to read from a database file, it follows the following sequence of steps:

  1. Open the database file and obtain a SHARED lock. If the SHARED lock cannot be obtained, fail immediately and return SQLITE_BUSY.
  2. Check to see if the database file has a hot journal. If the file does not have a hot journal, we are done. Return immediately. If there is a hot journal, that journal must be rolled back by the subsequent steps of this algorithm.
  3. Acquire a PENDING lock then an EXCLUSIVE lock on the database file. (Note: Do not acquire a RESERVED lock because that would make other processes think the journal was no longer hot.) If we fail to acquire these locks it means another process is already trying to do the rollback. In that case, drop all locks, close the database, and return SQLITE_BUSY.
  4. Read the journal file and roll back the changes.
  5. Wait for the rolled back changes to be written onto persistent storage. This protects the integrity of the database in case another power failure or crash occurs.
  6. Delete the journal file (or truncate the journal to zero bytes in length if PRAGMA journal_mode=TRUNCATE is set, or zero the journal header if PRAGMA journal_mode=PERSIST is set).
  7. Delete the super-journal file if it is safe to do so. This step is optional. It is here only to prevent stale super-journals from cluttering up the disk drive. See the discussion below for details.
  8. Drop the EXCLUSIVE and PENDING locks but retain the SHARED lock.

After the algorithm above completes successfully, it is safe to read from the database file. Once all reading has completed, the SHARED lock is dropped.
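
The eight steps above can be traced with a toy model. The sketch below does no file I/O; ToyDatabase and prepare_for_read are hypothetical names, and the recorded actions are only labels for steps 4 through 7. The result code values are those defined in sqlite3.h.

```python
SQLITE_OK, SQLITE_BUSY = 0, 5  # result codes as defined in sqlite3.h


class ToyDatabase:
    """Hypothetical stand-in for a database file and its locks."""

    def __init__(self, hot_journal=False, refused=()):
        self.hot_journal = hot_journal
        self.refused = set(refused)  # lock names this process cannot obtain
        self.held = set()
        self.actions = []

    def try_lock(self, name):
        if name in self.refused:
            return False
        self.held.add(name)
        return True


def prepare_for_read(db):
    if not db.try_lock("SHARED"):                     # step 1
        return SQLITE_BUSY
    if not db.hot_journal:                            # step 2
        return SQLITE_OK
    # Step 3: PENDING then EXCLUSIVE, never RESERVED.
    if not (db.try_lock("PENDING") and db.try_lock("EXCLUSIVE")):
        db.held.clear()
        return SQLITE_BUSY
    db.actions.append("roll back journal")            # step 4
    db.actions.append("sync database to storage")     # step 5
    db.actions.append("delete journal")               # step 6
    db.hot_journal = False
    db.actions.append("delete stale super-journal")   # step 7 (optional)
    db.held -= {"EXCLUSIVE", "PENDING"}               # step 8
    return SQLITE_OK


hot = ToyDatabase(hot_journal=True)
print(prepare_for_read(hot), sorted(hot.held), hot.actions)
```

In every successful outcome the process ends up holding only a SHARED lock, which is the state required for reading.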

4.2 Deleting stale super-journals

A stale super-journal is a super-journal that is no longer being used for anything. There is no requirement that stale super-journals be deleted. The only reason for doing so is to free up disk space.

A super-journal is stale if no individual file journals are pointing to it. To figure out if a super-journal is stale, we first read the super-journal to obtain the names of all of its file journals. Then we check each of those file journals. If any of the file journals named in the super-journal exists and points back to the super-journal, then the super-journal is not stale. If all file journals are either missing or refer to other super-journals or no super-journal at all, then the super-journal we are testing is stale and can be safely deleted.
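
The staleness test can be sketched over an in-memory picture of the directory. In this hypothetical model a super-journal is represented by the list of file-journal names it contains, and a file journal by the name of the super-journal it points to (an empty string if none). The file names are invented for the example.

```python
def super_journal_is_stale(super_name, files):
    """Return True if no existing file journal points back to `super_name`."""
    for journal_name in files[super_name]:
        if journal_name in files and files[journal_name] == super_name:
            return False  # at least one file journal still uses it
    return True


files = {
    "main.db-mj01": ["main.db-journal", "aux.db-journal"],
    "main.db-journal": "main.db-mj02",  # refers to a different super-journal
    # aux.db-journal is missing entirely
}
print(super_journal_is_stale("main.db-mj01", files))  # True
```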

5.0 Writing to a database file

To write to a database, a process must first acquire a SHARED lock as described above (possibly rolling back incomplete changes if there is a hot journal). After a SHARED lock is obtained, a RESERVED lock must be acquired. The RESERVED lock signals that the process intends to write to the database at some point in the future. Only one process at a time can hold a RESERVED lock. But other processes can continue to read the database while the RESERVED lock is held.

If the process that wants to write is unable to obtain a RESERVED lock, it must mean that another process already has a RESERVED lock. In that case, the write attempt fails and returns SQLITE_BUSY.
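
The effect of a RESERVED lock can be observed with two connections from Python's standard sqlite3 module. This is a sketch rather than a test of the pager; it assumes rollback-journal mode and sets timeout=0 so that a busy condition is reported immediately instead of being retried. The connection names are arbitrary.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(db, timeout=0, isolation_level=None)
    other = sqlite3.connect(db, timeout=0, isolation_level=None)
    writer.execute("PRAGMA journal_mode=DELETE").fetchone()
    writer.execute("CREATE TABLE t(x)")
    writer.execute("INSERT INTO t VALUES (1)")

    writer.execute("BEGIN")
    writer.execute("INSERT INTO t VALUES (2)")  # writer now holds RESERVED

    # Readers are still admitted and see the unaltered database file.
    rows_seen = other.execute("SELECT x FROM t").fetchall()

    # A second writer cannot obtain RESERVED and fails with SQLITE_BUSY.
    try:
        other.execute("INSERT INTO t VALUES (3)")
        second_write = "ok"
    except sqlite3.OperationalError as exc:
        second_write = str(exc)

    writer.execute("COMMIT")
    writer.close()
    other.close()

print(rows_seen, second_write)
```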

After obtaining a RESERVED lock, the process that wants to write creates a rollback journal. The header of the journal is initialized with the original size of the database file. Space in the journal header is also reserved for a super-journal name, though the super-journal name is initially empty.

Before making changes to any page of the database, the process writes the original content of that page into the rollback journal. Changes to pages are held in memory at first and are not written to the disk. The original database file remains unaltered, which means that other processes can continue to read the database.

Eventually, the writing process will want to update the database file, either because its memory cache has filled up or because it is ready to commit its changes. Before this happens, the writer must make sure no other process is reading the database and that the rollback journal data is safely on the disk surface so that it can be used to roll back incomplete changes in the event of a power failure. The steps are as follows:

  1. Make sure all rollback journal data has actually been written to the surface of the disk (and is not just being held in the operating system's or disk controller's cache) so that if a power failure occurs the data will still be there after power is restored.
  2. Obtain a PENDING lock and then an EXCLUSIVE lock on the database file. If other processes still have SHARED locks, the writer might have to wait until those SHARED locks clear before it is able to obtain an EXCLUSIVE lock.
  3. Write all page modifications currently held in memory out to the original database disk file.

If the reason for writing to the database file is that the memory cache was full, then the writer will not commit right away. Instead, the writer might continue to make changes to other pages. Before subsequent changes are written to the database file, the rollback journal must be flushed to disk again. Note also that the EXCLUSIVE lock that the writer obtained in order to write to the database initially must be held until all changes are committed. That means that no other processes are able to access the database from the time the memory cache first spills to disk until the transaction commits.

When a writer is ready to commit its changes, it executes the following steps:

  1. Obtain an EXCLUSIVE lock on the database file and make sure all memory changes have been written to the database file using the algorithm of steps 1-3 above.
  2. Flush all database file changes to the disk. Wait for those changes to actually be written onto the disk surface.
  3. Delete the journal file. (Or if the PRAGMA journal_mode is TRUNCATE or PERSIST, truncate the journal file or zero the header of the journal file, respectively.) This is the instant when the changes are committed. Prior to deleting the journal file, if a power failure or crash occurs, the next process to open the database will see that it has a hot journal and will roll the changes back. After the journal is deleted, there will no longer be a hot journal and the changes will persist.
  4. Drop the EXCLUSIVE and PENDING locks from the database file.

As soon as the PENDING lock is released from the database file, other processes can begin reading the database again. In the current implementation, the RESERVED lock is also released, but that is not essential for correct operation.
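
What step 3 of the commit sequence leaves on disk depends on the journal mode, and this can be checked from Python's standard sqlite3 module. The helper journal_after_commit below is hypothetical. The sketch reports only whether the journal file exists and how large it is; it does not inspect the header.

```python
import os
import sqlite3
import tempfile


def journal_after_commit(mode):
    """Commit one transaction under `mode`; return (journal exists, size in bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        db = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(db, isolation_level=None)
        conn.execute("PRAGMA journal_mode=" + mode).fetchone()
        conn.execute("CREATE TABLE t(x)")
        conn.execute("BEGIN")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.execute("COMMIT")
        journal = db + "-journal"
        exists = os.path.exists(journal)
        size = os.path.getsize(journal) if exists else 0
        conn.close()
        return exists, size


for mode in ("DELETE", "TRUNCATE", "PERSIST"):
    print(mode, journal_after_commit(mode))
```

In DELETE mode no journal remains, in TRUNCATE mode an empty journal file remains, and in PERSIST mode the file remains with its header zeroed. In all three cases the journal is no longer hot.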

If a transaction involves multiple databases, then a more complex commit sequence is used, as follows:

  1. Make sure all individual database files have an EXCLUSIVE lock and a valid journal.
  2. Create a super-journal. The name of the super-journal is arbitrary. (The current implementation appends random suffixes to the name of the main database file until it finds a name that does not previously exist.) Fill the super-journal with the names of all the individual journals and flush its contents to disk.
  3. Write the name of the super-journal into all individual journals (in space set aside for that purpose in the headers of the individual journals) and flush the contents of the individual journals to disk and wait for those changes to reach the disk surface.
  4. Flush all database file changes to the disk. Wait for those changes to actually be written onto the disk surface.
  5. Delete the super-journal file. This is the instant when the changes are committed. Prior to deleting the super-journal file, if a power failure or crash occurs, the individual file journals will be considered hot and will be rolled back by the next process that attempts to read them. After the super-journal has been deleted, the file journals will no longer be considered hot and the changes will persist.
  6. Delete all individual journal files.
  7. Drop the EXCLUSIVE and PENDING locks from all database files.

5.1 Writer starvation

In SQLite version 2, if many processes are reading from the database, it might be the case that there is never a time when there are no active readers. And if there is always at least one read lock on the database, no process would ever be able to make changes to the database because it would be impossible to acquire a write lock. This situation is called writer starvation.

SQLite version 3 seeks to avoid writer starvation through the use of the PENDING lock. The PENDING lock allows existing readers to continue but prevents new readers from connecting to the database. So when a process wants to write to a busy database, it can set a PENDING lock which will prevent new readers from coming in. Assuming existing readers do eventually complete, all SHARED locks will eventually clear and the writer will be given a chance to make its changes.
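
A toy simulation shows why the PENDING lock matters. The model is hypothetical and deliberately simple: time advances in ticks, one new reader tries to arrive on every tick, and each reader holds its SHARED lock for a fixed number of ticks. It does not measure SQLite itself.

```python
def ticks_until_write(use_pending, horizon=50, read_duration=3):
    """Return the tick at which the writer finds no readers, or None if it starves."""
    readers = [read_duration]  # remaining ticks for each active reader
    for tick in range(horizon):
        if not readers:
            return tick  # all SHARED locks have cleared
        if not use_pending:
            readers.append(read_duration)  # a new reader is admitted
        # Every reader ages one tick; finished readers drop their lock.
        readers = [r - 1 for r in readers if r > 1]
    return None


print(ticks_until_write(use_pending=False))  # None: readers always overlap
print(ticks_until_write(use_pending=True))   # 3: existing readers drain
```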

6.0 How To Corrupt Your Database Files

The pager module is very robust but it can be subverted. This section attempts to identify and explain the risks. (See also the Things That Can Go Wrong section of the article on Atomic Commit.)

Clearly, a hardware or operating system fault that introduces incorrect data into the middle of the database file or journal will cause problems. Likewise, if a rogue process opens a database file or journal and writes malformed data into the middle of it, then the database will become corrupt. There is not much that can be done about these kinds of problems so they are given no further attention.

SQLite uses POSIX advisory locks to implement locking on Unix. On Windows it uses the LockFile(), LockFileEx(), and UnlockFile() system calls. SQLite assumes that these system calls all work as advertised. If that is not the case, then database corruption can result. One should note that POSIX advisory locking is known to be buggy or even unimplemented on many NFS implementations (including recent versions of Mac OS X) and that there are reports of locking problems for network filesystems under Windows. Your best defense is to not use SQLite for files on a network filesystem.

SQLite uses the fsync() system call to flush data to the disk under Unix and it uses FlushFileBuffers() to do the same under Windows. Once again, SQLite assumes that these operating system services function as advertised. But it has been reported that fsync() and FlushFileBuffers() do not always work correctly, especially with some network filesystems or inexpensive IDE disks. Apparently some manufacturers of IDE disks have controller chips that report that data has reached the disk surface when in fact the data is still in volatile cache memory in the disk drive electronics. There are also reports that Windows sometimes chooses to ignore FlushFileBuffers() for unspecified reasons. The author cannot verify any of these reports. But if they are true, it means that database corruption is a possibility following an unexpected power loss. These are hardware and/or operating system bugs that SQLite is unable to defend against.

If a Linux ext3 filesystem is mounted without the "barrier=1" option in the /etc/fstab and the disk drive write cache is enabled then filesystem corruption can occur following a power loss or OS crash. Whether or not corruption can occur depends on the details of the disk control hardware; corruption is more likely with inexpensive consumer-grade disks and less of a problem for enterprise-class storage devices with advanced features such as non-volatile write caches. Various ext3 experts confirm this behavior. We are told that most Linux distributions do not use barrier=1 and do not disable the write cache so most Linux distributions are vulnerable to this problem. Note that this is an operating system and hardware issue and that there is nothing that SQLite can do to work around it. Other database engines have also run into this same problem.

If a crash or power failure occurs and results in a hot journal but that journal is deleted, the next process to open the database will not know that it contains changes that need to be rolled back. The rollback will not occur and the database will be left in an inconsistent state. Rollback journals might be deleted for any number of reasons:

The last (fourth) bullet above merits additional comment. When SQLite creates a journal file on Unix, it opens the directory that contains that file and calls fsync() on the directory, in an effort to push the directory information to disk. But suppose some other process is adding or removing unrelated files to the directory that contains the database and journal at the moment of a power failure. The supposedly unrelated actions of this other process might result in the journal file being dropped from the directory and moved into "lost+found". This is an unlikely scenario, but it could happen. The best defenses are to use a journaling filesystem or to keep the database and journal in a directory by themselves.

For a commit involving multiple databases and a super-journal, if the various databases were on different disk volumes and a power failure occurs during the commit, then when the machine comes back up the disks might be remounted with different names. Or some disks might not be mounted at all. When this happens the individual file journals and the super-journal might not be able to find each other. The worst outcome from this scenario is that the commit ceases to be atomic. Some databases might be rolled back and others might not. All databases will continue to be self-consistent. To defend against this problem, keep all databases on the same disk volume and/or remount disks using exactly the same names after a power failure.

7.0 Transaction Control At The SQL Level

The changes to locking and concurrency control in SQLite version 3 also introduce some subtle changes in the way transactions work at the SQL language level. By default, SQLite version 3 operates in autocommit mode. In autocommit mode, all changes to the database are committed as soon as all operations associated with the current database connection complete.

The SQL command "BEGIN TRANSACTION" (the TRANSACTION keyword is optional) is used to take SQLite out of autocommit mode. Note that the BEGIN command does not acquire any locks on the database. After a BEGIN command, a SHARED lock will be acquired when the first SELECT statement is executed. A RESERVED lock will be acquired when the first INSERT, UPDATE, or DELETE statement is executed. No EXCLUSIVE lock is acquired until either the memory cache fills up and must be spilled to disk or until the transaction commits. In this way, the system delays blocking read access to the file until the last possible moment.
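
The deferred locking described above can be observed with two connections from Python's standard sqlite3 module. This is a sketch assuming rollback-journal mode, with timeout=0 so that a busy condition is reported immediately. The connection names a and b are arbitrary.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    a = sqlite3.connect(db, timeout=0, isolation_level=None)
    b = sqlite3.connect(db, timeout=0, isolation_level=None)
    a.execute("PRAGMA journal_mode=DELETE").fetchone()
    a.execute("CREATE TABLE t(x)")

    a.execute("BEGIN")                     # acquires no lock
    b.execute("INSERT INTO t VALUES (1)")  # succeeds: a holds nothing yet

    # The first SELECT acquires SHARED, held until a's transaction ends.
    first_rows = a.execute("SELECT x FROM t").fetchall()

    try:
        b.execute("INSERT INTO t VALUES (2)")  # cannot get EXCLUSIVE to commit
        blocked = False
    except sqlite3.OperationalError:
        blocked = True

    a.execute("COMMIT")                    # releases the SHARED lock
    b.execute("INSERT INTO t VALUES (2)")  # now succeeds
    final_rows = b.execute("SELECT x FROM t ORDER BY x").fetchall()
    a.close()
    b.close()

print(first_rows, blocked, final_rows)
```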

The SQL command "COMMIT" does not actually commit the changes to disk. It just turns autocommit back on. Then, at the conclusion of the command, the regular autocommit logic takes over and causes the actual commit to disk to occur. The SQL command "ROLLBACK" also operates by turning autocommit back on, but it also sets a flag that tells the autocommit logic to roll back rather than commit.

If the SQL COMMIT command turns autocommit on and the autocommit logic then tries to commit the changes but fails because some other process is holding a SHARED lock, then autocommit is turned back off automatically. This allows the user to retry the COMMIT at a later time after the SHARED lock has had an opportunity to clear.
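
A retried COMMIT can be demonstrated with Python's standard sqlite3 module. In this sketch a reader keeps a SHARED lock alive by leaving a SELECT partly consumed; it assumes rollback-journal mode and timeout=0 so that the busy condition is reported immediately. The connection names are arbitrary.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(db, timeout=0, isolation_level=None)
    reader = sqlite3.connect(db, timeout=0, isolation_level=None)
    writer.execute("PRAGMA journal_mode=DELETE").fetchone()
    writer.execute("CREATE TABLE t(x)")
    writer.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

    cur = reader.execute("SELECT x FROM t")
    cur.fetchone()  # statement still active: reader holds SHARED

    writer.execute("BEGIN")
    writer.execute("INSERT INTO t VALUES (99)")
    try:
        writer.execute("COMMIT")
        first_try = "committed"
    except sqlite3.OperationalError as exc:
        first_try = str(exc)
    still_open = writer.in_transaction  # autocommit was turned back off

    cur.fetchall()  # finish the read, which releases the SHARED lock
    cur.close()
    writer.execute("COMMIT")  # the retry succeeds
    row_count = reader.execute("SELECT count(*) FROM t").fetchone()[0]
    writer.close()
    reader.close()

print(first_try, still_open, row_count)
```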

This page last modified on 2025-08-07 13:08:22 UTC
