OSR Online Lists > ntdev
  Message 1 of 9  
28 Sep 17 14:30
Chirag Chinmay
xxxxxx@gmail.com
Join Date: 28 Sep 2017
Posts To This List: 3
Size of Data Copy from Memory Mapped Device to Host in Storport Miniport Driver

Hi All,

I am currently working on a Storport miniport driver. I have mapped my device BAR through StorPortGetDeviceBase, and I now have to do a 4 KB data transfer from the mapped device address to a host address. I use StorPortCopyMemory for the transfer, but when I do, I see multiple MemRd32 requests with a length of 2 DWORDs each being issued. So a single memcpy generates 4 KB / 8 bytes = 512 PCIe transactions, which is a costly operation.

Is there any way to increase the transfer size beyond 2 DWORDs? That would reduce the number of transactions.

Regards,
Chirag
  Message 2 of 9  
29 Sep 17 02:18
Tim Roberts
xxxxxx@probo.com
Join Date: 28 Jan 2005
Posts To This List: 11622

On Sep 28, 2017, at 11:30 AM, xxxxx@gmail.com <xxxxx@lists.osr.com> wrote:
>
> Currently working on a Storport miniport driver, I have mapped my device BAR through StorPortGetDeviceBase. I have to now do a 4 KB data transfer from the mapped address of the device to a host address. I use StorPortCopyMemory to do the data transfer, but while doing that, I can see multiple MemRd32 with a length of 2 DWORDs getting copied to the host. Basically, for a single memcpy, 4 KB / 8 bytes PCIe transactions happened, which is a costly operation. Is there any way to increase the memory transfer size greater than 2 DWORDs? That will reduce the number of transactions.

Not from the host. Think about it from a hardware level: the CPU is just reading memory. It doesn't have any idea that you're talking to a bus device, so it can't force any PCIe magic, and the PCIe root complex has no way to know that you'll be issuing a whole set of reads, so it can't combine them.

The only way to get larger PCIe transfers is to do bus-mastered DMA from the device. Then you can get up to the maximum packet size for the bus, usually 128 bytes.

However, I strongly suspect you are guilty of premature optimization. What makes you think this will make a measurable difference?

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
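
To put rough numbers on the difference, here is a minimal sketch of the transaction counts for the 4 KB copy discussed above. The 8-byte figure is the 2-DWORD MemRd32 size reported in the original post, and the 128-byte figure is the typical maximum packet size mentioned in the reply. The function name is made up for illustration, and per-TLP header overhead is ignored.

```python
def transactions_needed(total_bytes: int, bytes_per_transaction: int) -> int:
    """Return how many bus transactions are needed to move total_bytes.

    Uses ceiling division so a partial final transaction still counts.
    """
    if total_bytes < 0 or bytes_per_transaction <= 0:
        raise ValueError("sizes must be non-negative / positive")
    return -(-total_bytes // bytes_per_transaction)


TRANSFER_BYTES = 4 * 1024   # the 4 KB copy from the original post
PIO_READ_BYTES = 2 * 4      # 2 DWORDs per MemRd32, as observed on the bus
DMA_PAYLOAD_BYTES = 128     # typical max payload cited in the reply (assumed here)

pio_reads = transactions_needed(TRANSFER_BYTES, PIO_READ_BYTES)
dma_tlps = transactions_needed(TRANSFER_BYTES, DMA_PAYLOAD_BYTES)

print(f"CPU-issued reads: {pio_reads}")
print(f"DMA payload TLPs: {dma_tlps}")
```

Under these assumptions the host-driven copy needs 16 times as many transactions as device DMA, which is the gap the reply is describing.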
  Message 3 of 9  
01 Oct 17 22:54
Muthazhagan Arulbalasubramani
xxxxxx@gmail.com
Join Date: 23 Jan 2017
Posts To This List: 11

How about modifying PCI Express Capability -> Device Control register -> "Max_Payload_Size" of a TLP? Will it help?
  Message 4 of 9  
02 Oct 17 16:27
Pavel A
xxxxxx@fastmail.fm
Join Date: 21 Jul 2008
Posts To This List: 2401

> How about modifying PCI Express Capability -> Device Control register -> "Max_Payload_Size" of a TLP? Will it help?

This affects only DMA transfers. Not helpful for PIO accesses.

-- pa
  Message 5 of 9  
02 Oct 17 19:20
Tim Roberts
xxxxxx@probo.com
Join Date: 28 Jan 2005
Posts To This List: 11622

On Oct 1, 2017, at 7:55 PM, xxxxx@gmail.com <xxxxx@lists.osr.com> wrote:
>
> How about modifying PCI Express Capability -> Device Control register -> "Max_Payload_Size" of a TLP? Will it help?

Besides Pavel's answer, which is correct, you are not allowed to modify that register. The max payload size is negotiated at enumeration time. The root complex has an upper limit, and the device has an upper limit. The register is set to the smaller of the two values, and the connection won't work if you force it larger.

I tried to explain why this can't work from a host. At the bus transaction level, no one knows that this is going to be a long transfer, so no one can make a bigger transfer. If you need this, you will have to do DMA from the device.

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
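
The "smaller of the two values" rule can be sketched as follows, simplified to the two-party case described above. The 3-bit encoding (0 means 128 bytes, doubling up to 5 for 4096 bytes) is the one the PCI Express specification uses for the Max_Payload_Size fields. The function names and the example capability values are hypothetical, and this only models the arithmetic, not how an OS actually programs the register.

```python
def decode_max_payload(code: int) -> int:
    """Convert a 3-bit Max_Payload_Size encoding to a size in bytes."""
    if not 0 <= code <= 5:
        raise ValueError("encodings 6 and 7 are reserved")
    return 128 << code


def negotiated_max_payload(root_supported_code: int, device_supported_code: int) -> int:
    """Return the payload size both ends can handle: the smaller of the two limits."""
    return min(decode_max_payload(root_supported_code),
               decode_max_payload(device_supported_code))


# Hypothetical example: the root complex supports up to 256 bytes (code 1),
# the device supports up to 512 bytes (code 2).
print(negotiated_max_payload(1, 2))
```

Forcing the register above the returned value would let one side send payloads the other cannot accept, which is why the link stops working.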
  Message 6 of 9  
04 Oct 17 05:02
Chirag Chinmay
xxxxxx@gmail.com
Join Date: 28 Sep 2017
Posts To This List: 3

Can I do a system-mastered DMA to read the data from the device into a host buffer?
  Message 7 of 9  
04 Oct 17 19:36
Tim Roberts
xxxxxx@probo.com
Join Date: 28 Jan 2005
Posts To This List: 11622

On Oct 4, 2017, at 2:03 AM, xxxxx@gmail.com <xxxxx@lists.osr.com> wrote:
>
> Can I do a system mastered DMA to the device to read the data from the device into a host buffer?

In general, no. The DMA controller on most motherboards today is a legacy from the old ISA bus, useless for any general-purpose activities.

I understand that one of the newer Intel chipsets includes a system DMA controller, but I can't find the reference now, and I don't know if there is operating system support for it.

Do you have any real evidence that your performance is not good enough as is? Or are you just guessing?

I am not a disk driver guy, but the Microsoft documentation seems to say that Storport drivers are required to support DMA-based I/O. Does your hardware really not do DMA?

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
  Message 8 of 9  
05 Oct 17 03:25
Chirag Chinmay
xxxxxx@gmail.com
Join Date: 28 Sep 2017
Posts To This List: 3

The reason we don't want to do DMA from the device is that we are analyzing some cases in which the host does the transfer and not the device.

This is the actual problem that we are facing: the data copy is done in the DPC. During the copy from device to host using StorPortMoveMemory, the DPC runs longer than its scheduled maximum of 100 microseconds, which causes the OS to hang. For this reason we wanted to do a DMA from the host, so as to avoid stalling the processor.

Approach 1:
1. Using StorPortAllocateMdl to create an MDL for the device buffer.
2. Using StorPortBuildScatterGatherList to do the DMA.

Approach 2:
1. Using IoGetDmaAdapter to get the DmaAdapter from the PDO.
2. Using InitializeDmaTransferContext to initialize the DMA transfer.
3. Using AllocateAdapterChannelEx to allocate resources for the DMA transfer.
4. Using MapTransferEx to do the DMA.

Is either approach correct? If yes, is either of them feasible in a Storport miniport driver? If not, are there any other DMA APIs that can be used to do the same?

Thanks!
Chirag
  Message 9 of 9  
05 Oct 17 13:57
Tim Roberts
xxxxxx@probo.com
Join Date: 28 Jan 2005
Posts To This List: 11622

xxxxx@gmail.com wrote:
> The reason why we don't want to do DMA from device is because we are doing some analysis for some cases in which host does the transfer and not the device.

Well, I think your analysis is complete. When the host does the transfer, the TLP size will never be larger than 4.

> This is the actual Problem that we are facing:
> Data copy is done in the DPC. During data copy from device to host, using StorPortMoveMemory, the DPC runs for a longer time than its scheduled maximum of 100 microsecs. This causes the OS to hang. For this reason wanted to do a DMA from host so as to avoid stalling in the processor.

That has very little to do with TLP size and more to do with raw bandwidth. Regardless of TLP size, 100 us is only enough time to copy about 100 kB over a PCIe 3.0 x1 link. If you need to copy more, you either need to defer it to a lower-priority thread, or do it with device DMA.

> Approach 1 :
> 1. Using StorPortAllocateMdl to create MDL for Device buffer.
> 2. Using StorPortBuildScatterGatherList to do DMA.

I assume you are just abbreviating here, and that you realize StorPortBuildScatterGatherList doesn't actually do DMA. It just gives you the page numbers you need to create DMA descriptors for your hardware's DMA engine. You still have to set up and fire that operation yourself.

Assuming the device is doing the DMA, you don't need an MDL for the device buffer. The device knows its own destination, and device memory is always physically contiguous. The scatter/gather list is for the system memory buffer.

> Approach 2:
> 1. Using IoGetDmaAdapter to get DmaAdapter from PDO
> 2. Using InitializeDmaTransferContext to initialize the DMA Transfer
> 3. Using AllocateAdapterChannelEx to allocate resources for DMA Transfer
> 4. Using MapTransferEx to do the DMA

My same "abbreviation" comment applies to MapTransferEx. What you've described here is essentially what StorPortBuildScatterGatherList does, but without using kernel-specific APIs.

> Is any of the approach correct? If yes, is any of the approaches feasible in StorPort Miniport Driver? If not, are there any more DMA API's which can be used to do the same?

StorPortBuildScatterGatherList should be fine. You just need to build your descriptors and fire the DMA.

--
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
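
The roughly 100 kB figure in the reply above can be checked with a back-of-the-envelope calculation. This sketch assumes the PCIe 3.0 line rate of 8 GT/s per lane with 128b/130b encoding and ignores TLP header and flow-control overhead, so it is an upper bound rather than a measured number. The function name and parameters are illustrative only.

```python
def max_bytes_in_window(window_us: float,
                        lanes: int = 1,
                        gigatransfers_per_s: float = 8.0,
                        encoding_efficiency: float = 128 / 130) -> float:
    """Upper bound on bytes a PCIe link can carry in window_us microseconds.

    Defaults model a PCIe 3.0 x1 link: 8 GT/s per lane, 128b/130b encoding.
    Protocol overhead (TLP headers, DLLPs) is deliberately ignored.
    """
    bits_per_second = gigatransfers_per_s * 1e9 * lanes * encoding_efficiency
    bytes_per_second = bits_per_second / 8
    return bytes_per_second * window_us * 1e-6


budget = max_bytes_in_window(100)  # the 100 microsecond DPC budget from the thread
print(f"PCIe 3.0 x1, 100 us: about {budget / 1000:.1f} kB")
```

The result comes out just under 100 kB, in line with the estimate in the reply. Host-issued reads of a few DWORDs each will generally land well below this bound, since every read is a separate round trip.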

All times are GMT -5.

