Wednesday, November 1, 2017

Replication error...yes...transaction replication is alive and well!

Oh boy this was a fun one! So where I work now we have transactional replication set up between an OLTP and DataWarehouse database, in this case spread across two instances but in many cases on the same instance.

I get an urgent call in the middle of the night that client X can't run a report, it's returning 0's instead of information. Yes, this call came at 11:40 PM, yes it has to be fixed right now. *sigh*. OK, it IS what I signed up for. Anyways! Now I've seen this error before when replication is down, so easy fix...let's go look at the Replication Monitor. Hmm...it's up...all green. Let's reinitialize it from a fresh snapshot, shouldn't take too long.

Everything looks green in the Repl monitor.

Re-run the report.

Same thing. What?

Let's dig deeper. 

I didn't see anything in the monitor:




(dummy screenshot from http://www.sqlservercentral.com/articles/Stairway+Series/72451/ to protect the innocent)

I needed to dig in deeper to see the error; it wasn't until I got into the LogReader Agent Properties that I found the issue (open your server in Replication Monitor, go to your publication, right-click on LogReader Agent and then to Properties). Here is what I see on the error details:







What IS this?

So I dug around online to see what I could find to figure out exactly what the issue is. This is what I came up with:

USE distribution
GO

DECLARE @SeqNo varbinary(16),
  @SeqNoChar nchar(30);

SET @SeqNo = 0x000D12E200003C74000700000000;
SET @SeqNoChar = '0x000D12E200003C74000700000000';

SELECT *
FROM dbo.MSarticles
WHERE article_id IN (SELECT article_id
  FROM MSrepl_commands
  WHERE xact_seqno = @SeqNo)

EXEC sp_browsereplcmds
@xact_seqno_start = @SeqNoChar,
@xact_seqno_end = @SeqNoChar

What?!? Well, take the Transaction Sequence # from the error and plug it into the script, this will show you what wasn't able to be transferred over. At this point you can do the following (from https://support.microsoft.com/en-us/help/3066750/how-to-troubleshoot-error-20598-the-row-was-not-found-at-the-subscribe):


  • Manually insert the missing row at the subscriber. This may enable the Distribution Agent to retry the failed command and move forward with the replication. 

    Note There may be other rows that are missing and that have to be manually inserted at the subscriber if there are more failures.
  • Instruct the Distribution Agent to skip this error and continue to replicate the rest of the changes. The Distribution Agent accepts the skiperrors parameter. You can use this parameter to pass error code 20598. This may keep the replication setup intact while you wait for an opportunity to manually synchronize the missing rows. 
As it was very late (early?) at this point, the error had stopped replication for over 3 days (and there was no guarantee that inserting the record would allow replication to catch up), and they needed their reports run (*sigh*) replication was rebuilt. From scratch. Dropped all publishers and subscribers, dropped the Distribution DB, and set it up fresh. All is well now, although I missed out on some much-needed beauty sleep!

Friday, July 7, 2017

How to REALLY solve DbProviderFactories section can only appear once per config file error


Had this issue after installing SQL 2017, then installing SQL 2012 side-by-side and trying to create an SSIS package to load data from flat files to SQL. No idea if the dual installation caused it but it definitely caused many hours of frustration!

Many places on the web will have you only going to ONE folder to fix the machine.config file, took me much digging to find out there are two folders to check!

C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config

So here's how to do it:
First, go to one of the two folders listed above and you'll see the different versions. First go to the folder v4.0.30319 and open the CONFIG folder > Machine.Config file in a text editor (NOT WORD, use Notepad or something similar....oh hell, just use Notepad to be safe). In that file, search for the string “<DbProviderFactories/>”. If you found that, you'll see something similar to this (please note that this can be an issue with ANY provider, I'm showing DB2 here):




<system.data>
    <DbProviderFactories>
        <add name="IBM DB2 for i5/OS .NET Provider" invariant="IBM.Data.DB2.iSeries" description=".NET Framework Data Provider for i5/OS" type="IBM.Data.DB2.iSeries.iDB2Factory, IBM.Data.DB2.iSeries, Version=12.0.0.0, Culture=neutral, PublicKeyToken=9cdb2ebfb1f93a26"/>
        <add name="Microsoft SQL Server Compact Data Provider" invariant="System.Data.SqlServerCe.3.5" description=".NET Framework Data Provider for Microsoft SQL Server Compact" type="System.Data.SqlServerCe.SqlCeProviderFactory, System.Data.SqlServerCe, Version=3.5.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"/>
    </DbProviderFactories>
    <DbProviderFactories/>
</system.data>


The <DbProviderFactories/> doesn’t belong there so delete this line:

<DbProviderFactories/>

Save it and look in the other folders of the other versions for that line. It’s possible that this line also appears in v2.0.50727. Don't forget to check both folders (Framework and Framework64)!