Optimizations break my code

I know it sounds like hyperbole but I don’t know what else is going on.

The driver is an isolation encryption filter. In a debug build with no optimizations, everything works great. When I build the release version, the decryption is screwed up (data is only partially decrypted). I’m using the BCrypt functions for all crypto.

The only real difference between the debug and release builds is the optimization settings. Once I disable optimizations on the release build, the crypto works as it should, just like the debug build.

My code has no #ifdef _DEBUG’s or things like that and I’ve traced it to the BCryptDecrypt function call in my code. Exact same data in but different data out depending on if optimizations are enabled or not. I’ve tried with speed, size and full optimizations and they all seem to break it.

Before I jump into assembly level analysis on the two builds to see what’s going on, I’m hoping someone can point out something obvious I am overlooking. I’ve been working on this for quite a while now, so some fresh thinking wouldn’t hurt.

Hello, Devbotting.

On 18 June 2017, 5:57:35, you wrote:

> I know it sounds like hyperbole but I don’t know what else is going on.

> The only real difference between the debug and release builds are
> the optimization settings. Once I disable optimizations on the
> release build, the crypto works as it should just like the debug build.

In a debug build all local variables are initialized with some values; in
a release build they can contain random garbage. You probably have some
uninitialized variable(s) in the code, and this produces unpredictable
results in release builds because they hold random values (zero/non-zero
for a boolean, for example).
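
To make the uninitialized-variable point concrete, here is a minimal sketch in portable C. The CRYPTO_PARAMS struct and InitCryptoParams helper are hypothetical names invented for illustration; nothing here comes from the poster's driver. The idea is simply to zero the whole block before filling in the known fields, so that no flag or length is ever read as leftover stack garbage.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parameter block, for illustration only. */
typedef struct CRYPTO_PARAMS {
    int           UsePadding;   /* read as a boolean flag */
    unsigned long IvLength;
    unsigned char Iv[16];
} CRYPTO_PARAMS;

/* Zero the whole block first, then set only the fields that are known. */
void InitCryptoParams(CRYPTO_PARAMS *p, const unsigned char *iv,
                      unsigned long ivLength)
{
    assert(ivLength <= sizeof(p->Iv));
    memset(p, 0, sizeof(*p));      /* every field now has a defined value */
    p->IvLength = ivLength;
    memcpy(p->Iv, iv, ivLength);
}
```

In a driver, RtlZeroMemory would play the role of memset here.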

… and you might find you can only use a BCRYPT handle for one operation at
once.

Thanks for the suggestion Mike. Doesn’t appear to be the case here. I’ve re-checked everything in the code path to make sure.

Rod,

Also not the case here but where did you find that info? I don’t see anything in the docs that say as much.

Are you linking to cng.lib from the WDK for both debug and release builds?

On Sun, Jun 18, 2017 at 11:59 AM wrote:

> Thanks for the suggestion Mike. Doesn’t appear to be the case here. I’ve
> re-checked everything in the code path to make sure.
>
> Rod,
>
> Also not the case here but where did you find that info? I don’t see
> anything in the docs that say as much.
>
> --
> NTFSD is sponsored by OSR
>
> MONTHLY seminars on crash dump analysis, WDF, Windows internals and
> software drivers!
> Details at http:
>
> To unsubscribe, visit the List Server section of OSR Online at
> <http://www.osronline.com/page.cfm?name=ListServer>

Yes

I didn’t find that info anywhere. At least nowhere that I could provide a link to. I made the suggestion because I have previously seen code that "works great" compiled as debug but fails when compiled for release because the memory layout changed and what had always been an NTS (null-terminated string) became adjacent to other arbitrary memory. You can imagine the results, I expect.

In addition to memory layout changes, the next most common kind of bug is timing changes. This is especially true for KM code, as it is necessarily reentrant and often represents the irreducible complexity in a synchronization domain.

Assuming that none of these common platitudes apply to you, there will be no choice except to delve deeper. You can try assembly level analysis, but if you are anything like me, you loathe that work, as the next compile can completely invalidate it. As odd as it sounds, it is better to first guard against compiler errata by ensuring your code can have no ambiguous meanings (this is harder than you might think) and then look into memory flow during assembly analysis. As a last resort I would consider bugs in external code.


Are you using any sort of construction in the code like this:

typedef struct _SOME_GENERIC_HEADER
{
    int Type;
    int offset;
    char Data[1]; // variable in length; can contain different things
                  // depending on the Type field, for example
} SOME_GENERIC_HEADER, *PSOME_GENERIC_HEADER;

typedef struct _SOME_SPECIFIC_DATA // this whole structure is itself
                                   // contained somewhere in header->Data
{
    int SomeField;
    int NameLen;
    char Name[1]; // another variable-length field you may want to fill in
} SOME_SPECIFIC_DATA, *PSOME_SPECIFIC_DATA;

// This code will not work with optimization (or at least it did not work
// for me about 2 years ago).

PSOME_GENERIC_HEADER header = (some pointer);
PUNICODE_STRING name = (some US pointer);

RtlCopyMemory(&((PSOME_SPECIFIC_DATA)((PCHAR)&header->Data[0] +
    header->offset))->Name[0], name->Buffer, name->Length);

In my experience constructions like these break with optimization, and
there was also a compiler bug at some point that would not compile this
correctly.

Hope it helps.

Gabriel
www.kasardia.com

On Sun, Jun 18, 2017 at 5:56 PM, Marion Bond wrote:

> I didn’t find that info anywhere. At least nowhere that I could
> provide a link to. [...]

Bercea. G.

Yes. My FILTER_MESSAGE struct contains an empty array. It’s surrounded by a #pragma warning(disable:4200) to ignore the warning and also a #pragma pack(push, 1). I use this struct to query user mode for the key and IV for the encryption/decryption. Even though I do get partial decryption (the first few bytes of every block are valid), I’m curious if a partially corrupt key or IV caused by the issue you mention would cause that behavior.

In any case, now that you mention it might be an issue, I’m going to start wrapping stuff in #pragma optimize("", off) to see if I can narrow down the problem.

Thanks for the suggestion.
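
For readers who have not used this bisection trick, a minimal sketch of what it looks like follows. XorChecksum is a hypothetical stand-in for whichever function is under suspicion; the pragma itself is MSVC-specific, so it is guarded here to keep the sketch portable. With the empty string, "off" disables all optimizations for the functions that follow and "on" restores whatever the command line asked for. The pragma has to sit outside a function body.

```c
#include <stddef.h>

/* MSVC only: everything defined after this line is compiled unoptimized. */
#ifdef _MSC_VER
#pragma optimize("", off)
#endif

/* Hypothetical helper standing in for the function under suspicion. */
unsigned char XorChecksum(const unsigned char *buf, size_t len)
{
    unsigned char sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum ^= buf[i];
    }
    return sum;
}

/* MSVC only: restore the optimization settings given on the command line. */
#ifdef _MSC_VER
#pragma optimize("", on)
#endif
```

Moving the off/on pair from function to function in a release build narrows the failure down to the smallest region that still misbehaves when optimized.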

Also, how do you access the elements of this array? I never rely on the compiler to calculate the correct address using the subscript operator, but always do it manually with pointer math and many casts to and from (UINT_PTR). I have been bitten too many times where the apparent type of an unsized array causes element offsets versus byte offsets to get fouled up in ways that are difficult to spot in the source.

The rantings of an old man notwithstanding, I believe you have just found your bug:

You are sharing a structure between UM and KM, and when you change compiler settings the data in the struct becomes corrupted. Probably, despite your pragma pack, the layout of this struct changes in release builds. That can happen very easily, especially if you are using C++, and would exactly explain your issues. All will work well as long as you have the same settings in both UM and KM, but when they differ, you have a problem.

The solution is of course to establish the definitive API / ABI to use (easier said than done in some cases) and then ensure that the compiler settings always produce the same contract.

It is unlikely that a partially corrupted key or IV would lead to any valid bytes in a decrypted buffer. Modern encryption protocols are designed explicitly to prevent this from happening, as the resulting leakage of information could ultimately be used to crack the whole message. It is far more likely that either the buffer length sent to be decrypted is wrong or that the other valid bytes are there but not described correctly.


Winner winner chicken dinner.

Thanks for everyone’s suggestions. I #pragma’d my way down to a specific function that had some funny struct accesses and now everything works.

Curious though, would this be a compiler bug or just bad programming on my part, where I should have known better how an optimizing compiler would act?

> Also not the case here but where did you find that info? I don’t see
> anything in the docs that say as much.

School of hard knocks

You do not have to turn off optimization at all.
All you need to do is use "intermediate pointers".

Here is what I mean.
From my previous post, instead of addressing the data like:

&((PSOME_SPECIFIC_DATA)((PCHAR)&header->Data[0] + header->offset))->Name[0]

you could do:

PSOME_SPECIFIC_DATA SpecificData = (PSOME_SPECIFIC_DATA)((PCHAR)&header->Data[0] + header->offset);
PCHAR NamePointer = (PCHAR)&SpecificData->Name[0];
RtlCopyMemory(NamePointer, nameBuffer, nameLength);

If you handle it like this, you can leave optimization on and it should work.

Gabriel
www.kasardia.com
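
A self-contained version of the intermediate-pointer idea, for anyone who wants to try it outside a driver, might look like the sketch below. It reuses the struct shapes from earlier in the thread but swaps in standard C types and memcpy for the WDK ones; MessageSize and SetName are hypothetical helper names. It also derives each address from the start of the allocation with offsetof, which sidesteps any question about indexing past a one-element array. The sketch assumes the buffer is suitably aligned and that offset is a multiple of the int alignment.

```c
#include <stddef.h>
#include <string.h>

/* Same shapes as the structs earlier in the thread, in portable C. */
typedef struct SOME_GENERIC_HEADER {
    int  Type;
    int  offset;      /* byte offset of the payload inside Data */
    char Data[1];     /* variable-length payload starts here */
} SOME_GENERIC_HEADER;

typedef struct SOME_SPECIFIC_DATA {
    int  SomeField;
    int  NameLen;
    char Name[1];     /* variable-length name starts here */
} SOME_SPECIFIC_DATA;

/* Bytes needed for a header whose payload carries nameLen bytes of name. */
size_t MessageSize(int offset, size_t nameLen)
{
    return offsetof(SOME_GENERIC_HEADER, Data) + (size_t)offset
         + offsetof(SOME_SPECIFIC_DATA, Name) + nameLen;
}

/* Copy a name into the nested struct, one named step at a time.
   Assumes header points at an aligned buffer of at least MessageSize bytes. */
void SetName(SOME_GENERIC_HEADER *header, const char *name, size_t nameLen)
{
    char *payload = (char *)header
                  + offsetof(SOME_GENERIC_HEADER, Data) + header->offset;
    SOME_SPECIFIC_DATA *specific = (SOME_SPECIFIC_DATA *)payload;
    char *namePointer = (char *)specific + offsetof(SOME_SPECIFIC_DATA, Name);

    specific->NameLen = (int)nameLen;
    memcpy(namePointer, name, nameLen);
}
```

Each intermediate pointer can be inspected in a debugger or checked against the buffer bounds before the copy, which is much harder with the single nested expression.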

Assuredly this is not a compiler bug.

The C language is great in that it enables programmers to write programs that can be almost directly translated into machine instructions, but it also requires a certain amount of user knowledge. In this case, the problem is in fact language agnostic.

When creating an interface between programs or modules that are compiled separately, regardless of the language / compiler in use, a fixed Application Binary Interface (ABI) is essential. If conditional compilation can alter the ABI, it is not fixed and is therefore broken.
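
One way to make such a contract hard to break by accident is to assert the layout at compile time in a header shared by both sides. The sketch below assumes a C11 compiler; FILTER_MESSAGE_SKETCH and its fields are invented for illustration and are not the poster's actual FILTER_MESSAGE (which is not shown in the thread). It uses fixed-size arrays rather than an empty trailing array so that every offset is a constant.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical message layout; field names and sizes are assumptions
   made for this sketch. */
#pragma pack(push, 1)
typedef struct FILTER_MESSAGE_SKETCH {
    uint32_t KeyLength;
    uint32_t IvLength;
    uint8_t  Key[32];
    uint8_t  Iv[16];
} FILTER_MESSAGE_SKETCH;
#pragma pack(pop)

/* If any build configuration changes the layout, compilation fails here. */
static_assert(offsetof(FILTER_MESSAGE_SKETCH, KeyLength) == 0, "KeyLength moved");
static_assert(offsetof(FILTER_MESSAGE_SKETCH, IvLength) == 4, "IvLength moved");
static_assert(offsetof(FILTER_MESSAGE_SKETCH, Key) == 8, "Key moved");
static_assert(offsetof(FILTER_MESSAGE_SKETCH, Iv) == 40, "Iv moved");
static_assert(sizeof(FILTER_MESSAGE_SKETCH) == 56, "message size changed");
```

In WDK code the C_ASSERT macro can serve the same purpose. Because the checks live in the shared header, the user-mode and kernel-mode builds both have to agree with the same numbers or neither compiles.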


It should be needless to point out that the code

(PCHAR)&header->Data[0]

can be trivially improved to

(PCHAR)(header->Data)

and for pointer math, even better of course:

(UINT_PTR)(header->Data) + delta

where delta is another UINT_PTR value. The resulting UINT_PTR value can then be cast to the correct pointer type and dereferenced as usual. Optimizations cannot possibly have any effect on this code unless the compiler includes basic sorts of bugs like incorrect integer addition.
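
As a rough illustration of the integer-based pointer math described above, here is a small sketch in standard C. It uses uintptr_t, the portable counterpart of the Windows UINT_PTR type; AddBytes and ReadU32At are hypothetical helper names. The sketch assumes a flat address space, where converting a pointer to an integer, adding a byte count, and converting back yields the expected address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Advance a pointer by a byte count using plain integer addition. */
void *AddBytes(void *base, uintptr_t delta)
{
    uintptr_t address = (uintptr_t)base + delta;
    return (void *)address;
}

/* Read the 32-bit value stored delta bytes past base.
   Copying through memcpy avoids assuming the address is aligned. */
uint32_t ReadU32At(void *base, uintptr_t delta)
{
    uint32_t value;
    memcpy(&value, AddBytes(base, delta), sizeof(value));
    return value;
}
```

Because delta is always a byte count, there is no room for confusion between element offsets and byte offsets, whatever the declared type of the array.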
