We have discovered a very strange anomaly in one of our databases yesterday.
The database is being run on SQL 2000, and has been running problem free for
about 5 years, until an isolated event yesterday.
In simple terms, a transaction that has been committed is "missing" from the
database.
The normal steps for our transaction are as follows.
1. Begin a transaction.
2. Update some records in some tables.
3. Write a new record to the database in another table.
4. If all successful, Commit transaction. Read the data from written in step
3, and print out a hard copy. Included in the hardcopy is the ID field
(primary Key) of the row written in step 3.
5. If any step fails, roll back transaction, give message to the user.
Yesterday, we had a printout generated. The printout happens after the
commit, is a totally separate step, and is read from the data that is
written in step 3. At some point shortly there after, that record has
"disappeared" from the database. (Also, the Updated rows from step 2 were
not updated like they should be. They looked as they would have if the
transaction had never been committed.)
There is no way for that record to be deleted by the user, and I can say
with certainty a sysadmin didn't delete it, as I am the only user with admin
rights to do something like that, and I know I didn't execute a delete on
that row (let alone un-do the updates in step 2 above).
Now, there is where it gets more interesting. At around the same this
happened, we had blocking situation. Essencially, another transaction had
started, and locked a row on an unrelated table. That users network
connection went down before his transaction could be completed. Since the
lock being placed on the table was affecting other users (who needed to read
from that table) we killed the process of that connection.
Now, my questions are:
1. It seems like more than a coincidence that these 2 events happened at the
same time, but they happened at completly different and unrelated tables.
Can anyone think of a reason something like this might happen?
2. How could this have happened at all? I was under the assumption that once
a transaction was committed, there was no way it could not be in the
database?
3. Is there anything I can do to prevent this from happening in the future?
Thanks for any suggestions.
-Ryan> 1. It seems like more than a coincidence that these 2 events happened at
> the same time, but they happened at completly different and unrelated
> tables. Can anyone think of a reason something like this might happen?
Does your application maintain persistent database connections? If that is
the case, consider that the blocking episode could have caused a client-side
timeout. The transaction that was in progress when the timeout occurred
might not have gotten rolled back and all subsequent work done on that
connection was performed in the context of a transaction that was never
committed.
> 2. How could this have happened at all? I was under the assumption that
> once a transaction was committed, there was no way it could not be in the
> database?
A committed transaction is durable. However, note that COMMIT only
decrements @.@.TRANCOUNT; the transaction is not durable until @.@.TRANCOUNT
reaches zero.
> 3. Is there anything I can do to prevent this from happening in the
> future?
One method is to SET XACT_ABORT ON, which will roll back the transaction in
the event of a timeout. This is especially important when you issue BEGIN
TRAN in stored procedure code. I also believe some client APIs have the
smarts to issue a ROLLBACK following a detected exception as long as the
transaction was started via the API rather than a Transact-SQL BEGIN TRAN.
--
Hope this helps.
Dan Guzman
SQL Server MVP
"Ryan" <iQDevelopers@.nospam.nospam> wrote in message
news:ObACMuiVHHA.4832@.TK2MSFTNGP04.phx.gbl...
> We have discovered a very strange anomaly in one of our databases
> yesterday.
> The database is being run on SQL 2000, and has been running problem free
> for about 5 years, until an isolated event yesterday.
> In simple terms, a transaction that has been committed is "missing" from
> the database.
> The normal steps for our transaction are as follows.
> 1. Begin a transaction.
> 2. Update some records in some tables.
> 3. Write a new record to the database in another table.
> 4. If all successful, Commit transaction. Read the data from written in
> step 3, and print out a hard copy. Included in the hardcopy is the ID
> field (primary Key) of the row written in step 3.
> 5. If any step fails, roll back transaction, give message to the user.
> Yesterday, we had a printout generated. The printout happens after the
> commit, is a totally separate step, and is read from the data that is
> written in step 3. At some point shortly there after, that record has
> "disappeared" from the database. (Also, the Updated rows from step 2 were
> not updated like they should be. They looked as they would have if the
> transaction had never been committed.)
> There is no way for that record to be deleted by the user, and I can say
> with certainty a sysadmin didn't delete it, as I am the only user with
> admin rights to do something like that, and I know I didn't execute a
> delete on that row (let alone un-do the updates in step 2 above).
> Now, there is where it gets more interesting. At around the same this
> happened, we had blocking situation. Essencially, another transaction had
> started, and locked a row on an unrelated table. That users network
> connection went down before his transaction could be completed. Since the
> lock being placed on the table was affecting other users (who needed to
> read from that table) we killed the process of that connection.
> Now, my questions are:
> 1. It seems like more than a coincidence that these 2 events happened at
> the same time, but they happened at completly different and unrelated
> tables. Can anyone think of a reason something like this might happen?
> 2. How could this have happened at all? I was under the assumption that
> once a transaction was committed, there was no way it could not be in the
> database?
> 3. Is there anything I can do to prevent this from happening in the
> future?
> Thanks for any suggestions.
> -Ryan
>
No comments:
Post a Comment