Hello,
We use the following (German) SQL Server 2008 R2 environment:
===================================================
Microsoft SQL Server Enterprise Edition (64-bit)
Microsoft Windows NT 6.1 (7601)
NT x64
10.50.2789.0
4-node failover cluster
===================================================
The following procedure is executed by several in-house .NET Windows services installed on 4 application servers:
CREATE PROCEDURE [dbo].[prc_GetRsnVergabeByTyp]
     @Typ    VARCHAR(50)
    ,@Anzahl INT
AS
BEGIN
    IF (@Anzahl < 1)
    BEGIN
        RAISERROR('Value must be greater than 0!', 16, 1);
    END;

    UPDATE
        dbo.RsnVergabe
    SET
         WertVon      = (WertBis + 1)
        ,WertBis      = (WertBis + @Anzahl)
        ,GeaendertAm  = GETDATE()
        ,GeaendertVon = SUSER_NAME()
    OUTPUT
         inserted.WertVon
        ,inserted.WertBis
    WHERE
        Typ = @Typ;
END
Today the SQL Server instance running this logic crashed and a cluster failover happened (as expected).
Directly after the instance came back online (on another cluster node) we received 1,000 "primary key" errors in the application.
Our DBA found the following message in the Windows event log:
------------------------------------------------------------
The Transaction (UOW=%1, Description='%3') was unable to be committed, and instead rolled back; this was due to an error message returned by CLFS while attempting to write a Prepare or Commit record for the Transaction. The CLFS error returned was: %4.
------------------------------------------------------------
After some research in the team, we think the procedure was executed by a Windows service, which successfully received a normal block of 1,000 IDs.
The server then crashed before the update on dbo.RsnVergabe could be written to disk (to the .ldf and/or .mdf file). After the failover, a Windows service requested a new block of 1,000 unique IDs.
Normally those IDs should never have been handed out by the database before. But SQL Server returned the same IDs that it had already returned for the previous request. This resulted in our primary key problem.
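To make the suspected sequence concrete, here is a minimal sketch in which an explicit rollback stands in for the crash. It assumes that the caller runs the procedure inside an outer transaction, which we have not yet confirmed for our services, and that dbo.RsnVergabe contains a row for the hypothetical Typ value 'Demo'. The OUTPUT clause sends the range to the client as soon as the UPDATE runs, so the client can hold IDs whose update is later undone.

```sql
-- Sketch only: ROLLBACK stands in for the crash before the commit was hardened.
-- 'Demo' is a hypothetical Typ value; a matching row must exist in dbo.RsnVergabe.
BEGIN TRANSACTION;

-- First request: OUTPUT returns WertVon/WertBis to the client right away,
-- before the surrounding transaction has been committed.
EXEC dbo.prc_GetRsnVergabeByTyp @Typ = 'Demo', @Anzahl = 1000;

-- The update is undone, but the client already holds the range it was given.
ROLLBACK TRANSACTION;

-- Second request (after the "failover"): WertBis is back at its old value,
-- so the same range is handed out again.
EXEC dbo.prc_GetRsnVergabeByTyp @Typ = 'Demo', @Anzahl = 1000;
```

If both calls return the same WertVon/WertBis pair, that would match what we saw after the failover.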
Could this be the reason for our problem, and how can we prevent this situation in the future?
Thanks for any advice.
Best regards
Thorsten Müller