NDIS LWF driver binding removed during Windows upgrade

Hi experts

We are facing a peculiar problem.

With UEFI BIOS and Secure Boot enabled, our NDIS LWF driver bindings are getting removed during the Windows 10 upgrade to version 1607.

–> What could be the possible cause?
–> During the upgrade the machine restarts multiple times; how can I debug the issue?
–> The current upgrade logs are of no help. Any idea which logs to enable in this case?

Any help would be highly appreciated.

The problem is only reproduced if UEFI BIOS and Secure Boot are enabled.

I am not able to understand the relation between NDIS LWF bindings and Secure Boot.

In any case, the NDIS LWF driver is WHQL signed.

Even a reinstall after the upgrade works.

But we are not able to find the root cause of the bindings being removed during the upgrade.

I own the code that migrates NDIS LWF drivers during a Windows Upgrade. I am pretty darn sure that there’s nothing in that code that looks at Secure Boot state. So I don’t have an exact solution for you. What I can instead offer is a bit of background on how this is supposed to work (so you’ll be better-equipped to notice anything unusual) and tell you where the good log files are.

When it comes to LWFs, Windows has two separate databases. PNP owns the physical driver package (the INF and .SYS files), while netcfg owns the logical registration of your driver with the network stack. So there are two migrations that have to happen, and two places to check for log files.

First, PNP migrates all driver packages. You can check whether PNP even knows about your driver package anymore, after the OS upgrade. Compare the output of “pnputil.exe -e” before & after the upgrade, and see if your driver package is getting discarded by PNP’s driver store.
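
As a rough aid for that before/after comparison, here is a minimal Python sketch. It assumes you saved the “pnputil.exe -e” output to two plain-text files, and that the output uses the English “Label : value” layout with a “Published name” line starting each package; both are assumptions, so adjust the labels if your output differs. The function names are made up for illustration. Packages are matched on provider, class and version rather than on the oemN.inf name, on the assumption that the published name may not be stable across the upgrade.

```python
import sys

def parse_pnputil(text):
    """Parse saved `pnputil.exe -e` output into one dict per driver package."""
    packages, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner or blank line, no "label : value" pair
        key, value = key.strip(), value.strip()
        if key == "Published name":  # assumed to start each package entry
            current = {}
            packages.append(current)
        if current is not None:
            current[key] = value
    return packages

def identity(pkg):
    # Deliberately ignores the oemN.inf published name (see lead-in).
    return (pkg.get("Driver package provider"),
            pkg.get("Class"),
            pkg.get("Driver date and version"))

def missing_after_upgrade(before_text, after_text):
    """Return packages present before the upgrade but absent afterwards."""
    after_ids = {identity(p) for p in parse_pnputil(after_text)}
    return [p for p in parse_pnputil(before_text) if identity(p) not in after_ids]

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python pnpdiff.py before.txt after.txt
    with open(sys.argv[1], errors="replace") as b, open(sys.argv[2], errors="replace") as a:
        for pkg in missing_after_upgrade(b.read(), a.read()):
            print(pkg)
```

If your driver package shows up in the result, PNP dropped it; if the result is empty, the package survived and netcfg is the next place to look.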

You can also check PNP’s log files for mention of your driver; they’re usually pretty good about logging any problems. Grep c:\windows\inf\setupapi.*.log for your INF, zeroing in on events by timestamp. You should see some chatter about migrating your driver package.
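
If you don’t have grep handy on the test machine, something like the following Python sketch does the same job. The INF name passed on the command line is a placeholder for your own, and the helper names are hypothetical; the match is a plain case-insensitive substring search, nothing smarter.

```python
import glob
import sys

def find_mentions(lines, needle):
    """Return (line_number, text) for each line containing needle, ignoring case."""
    needle = needle.lower()
    return [(n, line.rstrip())
            for n, line in enumerate(lines, 1)
            if needle in line.lower()]

def scan_logs(pattern, needle):
    """Print every mention of needle in the files matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as f:
            for n, text in find_mentions(f, needle):
                print(f"{path}:{n}: {text}")

if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python findinf.py your_driver.inf
    scan_logs(r"C:\Windows\INF\setupapi.*.log", sys.argv[1])
```

The line numbers give you a starting point; the timestamps in the surrounding log section tell you whether a hit belongs to the upgrade or to an earlier install.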

If pnputil.exe -e knows about your driver package, then maybe netcfg couldn’t migrate it. For each LWF driver on the downlevel OS, netcfg calls INetCfgClassSetup::Install again once the system boots into the uplevel OS.

Netcfg writes out C:\windows\system32\netsetupmig.log while it’s doing that migration. You should see lines like:
02:57:38: Migrating your_driver_name.
02:57:38: Attempting to re-install your_driver_name via its INF
02:57:38: Merging your_driver_name.
. . .
02:57:40: Committing the NetSetup transaction.
02:57:40: Migration succeeded.

Jeffrey Tippet wrote:

> Netcfg writes out C:\windows\system32\netsetupmig.log while it’s doing that migration.

Thanks for the useful information, Jeffrey Tippet. We’re facing a
similar case where upgrade from Windows 10 1607 (RS1) or earlier to
Windows 10 1703 (RS2) results in the removal of our NetClient-class
component “in some cases” and “for reasons unknown”.

Similar to what akohli_2004 reported, we see “machines that upgrade
fine, but others don’t.” It’s not specifically Secure Boot in our
case, though. But for example, our testing was showing a different
outcome on domain-joined machines versus non-domain-joined.

But our customers have also reported “nothing is lost” in the same
configurations where we see 100% failure rate in testing inside our
lab, so there is certainly some unidentified variable in the outcome.

Note we do experience this as “new to Windows 10 1703.” We do not
face any similar problems upgrading one Windows 10 build to the next
when going between 1507, 1511 or 1607 (TH1, TH2, RS1). The problem
also didn’t happen in build 15058 or earlier of the RS2 insider
preview builds, and first occurred in the 1506x March builds.

The setupapi logs we’ve reviewed have never suggested any kind of
failure in what SETUPAPI was attempting to do with our driver. The
netsetupmig.log in one of our failure scenarios shows a progression
like I’ve included at the end of this post.

The log shows we’re seeing ERROR_MOD_NOT_FOUND (0x8007007E) during the
first “Attempting to re-install NV_NVCLIENT via its INF”, followed by
ERROR_TIMEOUT (0x800705B4) on the next attempt, followed by 0x800106D9
(presumably RPC failure of some kind?) on the next attempt, followed
by two occurrences of ERROR_SERVICE_MARKED_FOR_DELETE (0x80070430) on
the final attempts.
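
For anyone who wants to check those readings, here is a small Python sketch that splits an HRESULT into its standard fields (failure bit, 11-bit facility, 16-bit code). The name tables only cover the four values from this log. Reading the low word of the facility-1 (RPC) value as a Win32 error number is an assumption on my part; under that reading, 1753 would be EPT_S_NOT_REGISTERED.

```python
FACILITY_NAMES = {1: "FACILITY_RPC", 7: "FACILITY_WIN32"}

# Win32 error numbers for the codes seen in this log only.
CODE_NAMES = {
    126: "ERROR_MOD_NOT_FOUND",
    1460: "ERROR_TIMEOUT",
    1072: "ERROR_SERVICE_MARKED_FOR_DELETE",
    1753: "EPT_S_NOT_REGISTERED",  # assumes the RPC-facility code is a Win32 number
}

def decode_hresult(value):
    """Split an HRESULT into (failed, facility, code)."""
    failed = bool(value & 0x80000000)   # bit 31: severity
    facility = (value >> 16) & 0x7FF    # bits 16-26: facility
    code = value & 0xFFFF               # bits 0-15: error code
    return failed, facility, code

def describe(value):
    failed, facility, code = decode_hresult(value)
    fac = FACILITY_NAMES.get(facility, f"facility {facility}")
    name = CODE_NAMES.get(code, "unknown")
    return f"0x{value:08X}: {fac}, code {code} ({name})"

if __name__ == "__main__":
    for hr in (0x8007007E, 0x800705B4, 0x800106D9, 0x80070430):
        print(describe(hr))
```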

ERROR_SERVICE_MARKED_FOR_DELETE fits with one of the “variable
outcomes” we seem to experience. In the “broken” case, once the 1703
upgrade is “finished” in that it allows the user to log on to Windows
for the first time, we find our drivers all still marked with a start
type of SERVICE_DISABLED (0x4).

But then once you finally decide to reboot the 1703 machine and logon
for a second time, now our drivers are simply /gone/ from
CurrentControlSet. So a pending delete definitely seemed to be in
play. But it’s as though this decision wasn’t made until /after/ the
upgrade was “finished” and we were allowed to logon. (Since the
deletion didn’t actually occur until the second reboot after
completion of the upgrade.)

Does the initial processing shown in the NETSETUP log happen at a time
when I should be successful in running Process Monitor or similar to
capture a broad view of what is being attempted when NETSETUP
encounters the original ERROR_MOD_NOT_FOUND condition? Or is it
happening at a time when the upgrade will have disabled most things
and rebooted into the controlled upgrade environment? Or is there
perhaps some further granularity or verbosity I can enable for the
NETSETUP log that might reveal more info?

> 12:10:57: Getting Client Drivers
> 12:10:57: Adding NV_NVCLIENT
> 12:10:57: Adding ms_msclient
> 12:10:57: Getting Binding Paths
> 12:10:57: Writing graph to uplevel.
> 12:10:57: Creating a NetSetup Transaction for migration with environment type 9.
> …
> 12:10:57: Migrating non-PnP enumerated objects.
> 12:10:58: Migrating NV_NVCLIENT.
> 12:10:58: Not migrating ms_msclient.
> 12:10:58: Attempting to re-install NV_NVCLIENT via its INF
> 12:10:59: Err: Failed to install component NV_NVCLIENT with error 8007007E.
> 12:10:59: Err: Clearing MigrationComplete value due to component installation failure
> …
> 12:10:59: Migrating NV_NVCLIENT.
> 12:10:59: Not migrating ms_msclient.
> 12:10:59: Not migrating ms_bridge.
> …
> 12:13:51: Getting Client Drivers
> 12:13:51: Adding NV_NVCLIENT
> 12:13:51: Adding ms_msclient
> 12:13:51: Getting Binding Paths
> 12:13:51: Writing graph to uplevel.
> 12:13:51: Creating a NetSetup Transaction for migration with environment type 9.
> 12:13:51: Preparing to write to transaction.
> 12:13:51: Migrating non-PnP enumerated objects.
> 12:13:51: Migrating NV_NVCLIENT.
> 12:13:51: Not migrating ms_msclient.
> 12:13:51: Attempting to re-install NV_NVCLIENT via its INF
> 12:15:53: Err: Failed to install component NV_NVCLIENT with error 800705B4.
> 12:15:53: Err: Clearing MigrationComplete value due to component installation failure
> …
> 12:15:53: Migrating NV_NVCLIENT.
> 12:15:53: Not migrating ms_msclient.
> 12:15:53: Not migrating ms_bridge.
> …
> 12:16:29: Getting Client Drivers
> 12:16:29: Adding NV_NVCLIENT
> 12:16:29: Adding ms_msclient
> 12:16:29: Getting Binding Paths
> 12:16:29: Writing graph to uplevel.
> 12:16:29: Creating a NetSetup Transaction for migration with environment type 9.
> 12:16:29: Preparing to write to transaction.
> 12:16:29: Migrating non-PnP enumerated objects.
> 12:16:29: Migrating NV_NVCLIENT.
> 12:16:29: Not migrating ms_msclient.
> 12:16:29: Attempting to re-install NV_NVCLIENT via its INF
> 12:16:32: Err: Failed to install component NV_NVCLIENT with error 800106D9.
> 12:16:32: Err: Clearing MigrationComplete value due to component installation failure
> …
> 12:16:32: Migrating NV_NVCLIENT.
> 12:16:32: Not migrating ms_msclient.
> 12:16:32: Not migrating ms_bridge.
> …
> 12:19:30: Adding NV_NVCLIENT
> 12:19:30: Adding ms_msclient
> 12:19:30: Getting Binding Paths
> 12:19:30: Writing graph to uplevel.
> 12:19:30: Creating a NetSetup Transaction for migration with environment type 9.
> 12:19:30: Preparing to write to transaction.
> 12:19:30: Migrating non-PnP enumerated objects.
> 12:19:30: Migrating NV_NVCLIENT.
> 12:19:30: Not migrating ms_msclient.
> 12:19:30: Attempting to re-install NV_NVCLIENT via its INF
> 12:19:31: Err: Failed to install component NV_NVCLIENT with error 80070430.
> 12:19:31: Err: Clearing MigrationComplete value due to component installation failure
> …
> 12:19:31: Migrating NV_NVCLIENT.
> 12:19:31: Not migrating ms_msclient.
> 12:19:31: Not migrating ms_bridge.
> …
> 12:21:17: Getting Client Drivers
> 12:21:17: Adding NV_NVCLIENT
> 12:21:17: Adding ms_msclient
> 12:21:17: Getting Binding Paths
> 12:21:17: Writing graph to uplevel.
> 12:21:17: Creating a NetSetup Transaction for migration with environment type 9.
> 12:21:17: Preparing to write to transaction.
> 12:21:17: Migrating non-PnP enumerated objects.
> 12:21:17: Migrating NV_NVCLIENT.
> 12:21:17: Not migrating ms_msclient.
> 12:21:17: Attempting to re-install NV_NVCLIENT via its INF
> 12:21:25: Err: Failed to install component NV_NVCLIENT with error 80070430.
> 12:21:25: Err: Clearing MigrationComplete value due to component installation failure
> …
> 12:21:25: Migrating NV_NVCLIENT.
> 12:21:25: Not migrating ms_msclient.
> 12:21:25: Not migrating ms_bridge.
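
To summarize a log like this without reading every line, a short Python sketch along these lines can pair each “Attempting to re-install” entry with the failure that follows and report how long the attempt took. It relies only on the two line formats visible in the excerpt above; the function name is made up, and timestamps that cross midnight are not handled.

```python
import re
from datetime import datetime

ATTEMPT = re.compile(r"(\d{2}:\d{2}:\d{2}): Attempting to re-install (\S+) via its INF")
FAILURE = re.compile(
    r"(\d{2}:\d{2}:\d{2}): Err: Failed to install component (\S+) with error ([0-9A-Fa-f]{8})"
)

def _clock(text):
    return datetime.strptime(text, "%H:%M:%S")

def install_failures(lines):
    """Return (time, component, hresult, elapsed_seconds) for each failed install."""
    started = {}   # component name -> time of its most recent attempt
    results = []
    for line in lines:
        m = ATTEMPT.search(line)
        if m:
            started[m.group(2)] = _clock(m.group(1))
            continue
        m = FAILURE.search(line)
        if m:
            when, name, code = m.groups()
            begin = started.pop(name, None)
            # elapsed is None when no matching attempt line was seen
            elapsed = None if begin is None else int((_clock(when) - begin).total_seconds())
            results.append((when, name, int(code, 16), elapsed))
    return results
```

Run over the excerpt above, this makes it easy to see that the second attempt sits for about two minutes before failing with 800705B4, while the other attempts fail within seconds.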

Note we’re also seeing a bunch of entries like the following between
each install attempt, but suspect they’re just secondary to the
install failure that occurs first. Omitted the repeats just for
brevity:

> 12:10:59: Migrating bind path {2788AD3C-187C-4850-A5D0-CD31109B725D},ms_tcpip,NV_NVCLIENT
> 12:10:59: The binding path {2788AD3C-187C-4850-A5D0-CD31109B725D},ms_tcpip,NV_NVCLIENT did not correspond to a binding path in NetSetup. Not migrating.
> 12:10:59: Successfully migrated bind path.

Alan Adams
Client for Open Enterprise Server
Micro Focus
xxxxx@microfocus.com

Hmm, that’s a rough log file. We do keep a slightly more detailed log in c:\windows\logs\netsetup\service*.etl. If you send me a direct mail with those attached, I’ll take a look.

The ERROR_MOD_NOT_FOUND is probably coming from an attempt to run the notify object from your driver package. Does that driver have a notify object dll? My guess is that your driver package is not where the OS migration engine expected to find it. I’m surprised that that error apparently eventually resolves itself.

I diffed the migration code in our source control system, and very little changed between 1607 and 1703, so I don’t know offhand what could explain the different results you’re seeing.

You’re right that the warnings about missing bindings are just a cascading issue from not being able to reinstall the driver.

Hi Jeffrey

I have sent you an email with two files attached: NetSetupMig.log and service.0.etl.

The subject of the email is “ndis lwf driver binding removed during windows upgrade”.