SQL Server: Fix: Transaction (Process ID) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction

I have a nightly SQL Agent job that aggregates backup information from system views to power a Power BI dashboard that the DBA’s review periodically for database backup failures.

It is actually a PowerShell job that runs every night in parallel to collect the data from hundreds of instances (both SQL Server and Oracle). As it runs, it maintains the fetch status/result of every operation in database tables which can be consulted if something were to go wrong.

Deadlocks in general:

Recently, the job kept failing with the below error:

Transaction (Process ID) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction

Notice that this is slightly different from a similar error below

Error 1205 : Transaction (Process ID) was deadlocked on resources with another process and has been chosen as the deadlock victim. Rerun the transaction

The difference from what I understand in this StackExchange question is that the former is caused by intra-query parallelism. Unfortunately, I don’t have the deadlock graph to show.

We know what deadlocks are and some of the common reasons they happen. If you need a refresher, I recommend this good article. I am not going to rehash what has already been said although these high level points are worth noting to resolve them:

1) Examine known Parallelism (where you have parallelized jobs)
2) Examine unknown Parallelism (unknown jobs or users interfere with your jobs in parallel)
3) Arrange order of tables doing DML to be the same across all code. E.g., Always Customers first, Orders second, OrderDetails third.
4) Examine the indexes on the affected tables to eliminate full-table scans
5) Reduce the amount of time spent in a transaction
6) Update in chunks especially if updating/deleting across sessions
7) Avoid RBAR (Row By Agonizing Row) CRUD operations! Do statement based mass changes.

My Deadlock case:

Being someone who logs all code obsessively can be a helpful trait at times (here is nice TSQL Logger if you need one). In this case, I knew the exact UPDATE statements where the error was happening. I have copied over the PowerShell snippet where the failure was happening.

if (($resultSetCount -eq 0) -or ($totalRowCount -eq 0))
{
	$sql = "/* -- ProbeOutputRun -- */
		UPDATE ProbeOutputRun
			SET
				Status = 'WARNING',
				StatusMessage = 'Probe succeeded but returned no output!',
				UpdateDate=$curDateTimeString,
				UpdateUser='$($env:username)'
			WHERE ProbeTargetRunId = {0}" -f $setItem.Key

	Invoke-Command -ScriptBlock $sbTrackingEvent `
		-ArgumentList $RepositoryDatabaseType, $RepositoryConnectionString, $sql, 'DataTable', $rowItemProbeRunCOnfig.QueryTimeout, $RepositoryDotNetProviderType
}

There are two other related tables around this update that were also being updated. If I had been running the code serially, I am sure there might not have been a problem as there would not have been a deadlock situation. Since this piece of code could be running simultaneously in 10 other sessions (updating using different key ID’s that are close to each other since they come from a sequence), we have a deadlock situation.

No Index on the WHERE clause column:

To cut to the chase, the problem was because the tables being updated were lacking indexes on the column used in the WHERE clause and had several million rows. Because of the lack of indexes, it has to do a full-scan to do the UPDATE in all 10 sessions which increases the amount of time spent in a transaction (although not explicit). The other related tables being updated also lacked an index on the column referenced in the WHERE clause. This increased time increases the possibility of deadlocks. In my case, I think only one table alone was enough to cause the deadlock (intra-query parallelism) but I could be wrong.

As soon as indexes were added on the column referenced in the WHERE clause, the problem went away! As a nice side-effect, the job also runs faster now.

Conclusion:

Do the things suggested in the bullets above to fix Deadlocks. If that does not fix the problem, capture Deadlocks using one of the methods suggested, analyze the objects involved and do the necessary (adding indexes, changing order etc). Adding logging statements using a TSQL logger can also help supplement.

2 thoughts on “SQL Server: Fix: Transaction (Process ID) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction

Leave a comment