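Last edited by FPGA课程 on 2024-9-12 09:27
Software version: VIVADO 2021.1
Operating system: WIN10 64bit
Hardware platform: XILINX A7/K7/Z7/ZU/KU series FPGAs
Experiment platform: MiLianKe (米联客) MLK-H3-CZ08-7100 development board
Board purchase: https://milianke.tmall.com/
Log in to the MiLianKe FPGA community at http://www.uisrc.com for video courses and Q&A.
1 Overview
This solution builds the FPGA project around the XDMA IP and uses the driver and test programs already compiled in "01 Compiling the PCIe driver on Windows" as the demo. The content is a general tutorial and applies to any XILINX board that supports PCIe communication. Alongside XDMA, MiLianKe uses its own FDMA controller IP, which makes data exchange simple and convenient.
2 System architecture
The key piece of this demo is an IP we wrote called uixdmairq. It works with the driver to handle interrupts. uixdmairq provides an AXI-Lite interface, and the host reads and writes its registers by accessing addresses in the user space. The IP latches the interrupt bits arriving on user_irq_req_i and forwards them to the XDMA IP. When the host driver services an interrupt, it writes the uixdmairq register from inside the interrupt handler to clear the interrupt that has been handled.
This solution also uses an AXI BRAM to demonstrate read and write access to the user space.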
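3 XDMA overview
The DMA Subsystem for PCI Express IP provided by Xilinx is a high-performance, configurable scatter-gather (SG) DMA for PCIe 2.0 and PCIe 3.0, with a user-selectable AXI4 or AXI4-Stream interface. Configured as AXI4, it can join the system bus interconnect; this suits large asynchronous transfers and usually involves DDR. The AXI4-Stream interface suits low-latency data streams.
XDMA is an SG DMA, not a block DMA. In SG mode the host organizes the data to be transferred into a linked list and passes the address of the list head to XDMA through a BAR. Starting from that head address, XDMA completes the transfers described by the list one after another.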
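To make the linked-list idea concrete, here is a minimal sketch in C. The `sg_desc` structure and `sg_total_bytes` are hypothetical names for illustration only; the layout is deliberately simplified and is not the real XDMA descriptor format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor. NOT the real XDMA descriptor layout. */
struct sg_desc {
    uint64_t src;                /* source address of this segment */
    uint64_t dst;                /* destination address of this segment */
    uint32_t len;                /* number of bytes to move */
    const struct sg_desc *next;  /* next descriptor, NULL ends the list */
};

/* Walk the list from its head, the way an SG engine would, and return the
 * total number of bytes described. *count receives the descriptor count. */
static uint64_t sg_total_bytes(const struct sg_desc *head, size_t *count)
{
    uint64_t total = 0;
    size_t n = 0;
    for (const struct sg_desc *d = head; d != NULL; d = d->next) {
        total += d->len;
        n++;
    }
    if (count != NULL)
        *count = n;
    return total;
}
```

The host only hands over the head of the list; everything else is found by following `next`, which is why scattered host buffers can be moved in one DMA operation.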
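AXI4 / AXI4-Stream: one of the two must be selected; used for data transfer.
AXI4-Lite Master: optional; maps PCIe BAR addresses to AXI4-Lite register addresses and can be used to read and write user-logic registers.
AXI4-Lite Slave: optional; exposes XDMA's internal registers to user logic. User logic can access XDMA's internal registers through this interface; it is not mapped to a BAR.
AXI4 Bypass: optional; gives PCIe direct access to user logic and can be used for low-latency transfers.
4 Building the XDMA-based PCIe FPGA project
4.1 XDMA IP configuration
1: Add the XDMA IP core
2: Configure the XDMA IP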
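Double-click the XDMA IP to configure it.
Mode: configuration mode; select Basic.
Lane Width: the number of PCIe lanes. The MZ7035FC has 8 lanes. Each board supports a different number of lanes, and more lanes mean higher throughput, so select the lane count that matches the actual hardware.
Max Link Speed: select 5.0 GT/s, i.e. PCIe 2.0. UltraScale and UltraScale+ FPGAs can support PCIe 3.0 for higher speed.
Reference Clock: 100 MHz.
DMA Interface Option: select AXI Memory Mapped, i.e. the AXI4 interface.
AXI Data Width: 128 bit, i.e. the AXI4 data bus is 128 bits wide.
AXI Clock: 250 MHz, i.e. the AXI4 interface clock is 250 MHz.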
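The link and AXI settings above are matched in raw bandwidth, which a small calculation can show. The sketch below assumes the 8b/10b line coding used by PCIe 2.0 (10 line bits per payload byte) and ignores packet and protocol overhead; the function names are ours.

```c
#include <stdint.h>

/* Raw bandwidth in MB/s of a PCIe link that uses 8b/10b coding
 * (PCIe 1.x/2.0): every byte costs 10 bits on the wire. */
static uint32_t pcie_8b10b_mbytes_per_s(uint32_t mega_transfers_per_s,
                                        uint32_t lanes)
{
    return mega_transfers_per_s / 10u * lanes;
}

/* Peak bandwidth in MB/s of an AXI data bus moving one beat per clock. */
static uint32_t axi_mbytes_per_s(uint32_t data_width_bits, uint32_t clock_mhz)
{
    return data_width_bits / 8u * clock_mhz;
}
```

With the values from this section, `pcie_8b10b_mbytes_per_s(5000, 8)` and `axi_mbytes_per_s(128, 250)` both give 4000 MB/s, so the 128-bit, 250 MHz AXI bus can keep up with a PCIe 2.0 x8 link.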
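PCIe ID configuration: the defaults are fine here. The default device type is Simple communication controllers.
PCIe BAR configuration: these settings matter. First enable PCIe to AXI Lite Master Interface, so that the host can access user-logic registers or other AXI4-Lite devices over PCIe. A mapped size of 1M is used here; users can of course choose another size as needed.
PCIe to AXI Translation: this setting is important. The host-side PCIe BAR address usually differs from the address on the user-logic side, and this setting translates BAR addresses into AXI addresses. For example, if the translation in the IP is set to 0x44A00000, a host access to BAR offset 0 becomes an access to AXI-Lite bus address 0x44A00000.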
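The translation amounts to adding a base to the BAR offset. The sketch below uses the 0x44A00000 translation value and the 1M BAR size from this section; the macro and function names are hypothetical, and the range check is our addition.

```c
#include <stdbool.h>
#include <stdint.h>

#define AXI_LITE_TRANSLATION 0x44A00000u /* PCIe to AXI Translation from the text */
#define AXI_LITE_BAR_SIZE    0x00100000u /* 1M BAR size from the text */

/* Translate an offset inside the user BAR to the AXI-Lite address it reaches.
 * Returns false when the offset lies outside the BAR. */
static bool bar_to_axi(uint32_t bar_offset, uint32_t *axi_addr)
{
    if (bar_offset >= AXI_LITE_BAR_SIZE)
        return false;
    *axi_addr = AXI_LITE_TRANSLATION + bar_offset;
    return true;
}
```

For instance, BAR offset 0x10000 lands on AXI-Lite address 0x44A10000.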
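PCIe to DMA Interface: enable 64bit.
DMA Bypass: not used for now.
PCIe interrupt settings
User Interrupts: XDMA provides 16 interrupt lines to user logic; the number of lines in use is configured here.
Legacy Interrupt: XDMA supports legacy interrupts; we do not select it here.
MSI Capabilities: enable MSI interrupts with 4 interrupt message vectors.
Note: only one of MSI and MSI-X can be selected, otherwise an error is reported. If MSI is selected, Legacy may also be selected. If MSI-X is selected, MSI must be deselected and Legacy must be set to None. This restriction applies to this IP on 7 series devices; on UltraScale devices all of them can be selected.
MSI-X Capabilities: not selected.
Miscellaneous: select Extended Tag Field.
Link Status Register: select Enable Slot Clock Configuration.
DMA settings
Number of DMA Read Channel (H2C) and Number of DMA Write Channel (C2H): the number of channels. XDMA can provide up to four independent H2C channels and four independent C2H channels, but for PCIe 2.0 the maximum that can be selected is 2. Independent channels are very useful in practice: as long as bandwidth allows, one PCIe link can carry several different transfer functions that do not affect each other. We select 2 here.
Number of Request IDs for Read (Write) channel: the maximum number of outstanding requests allowed per channel; keep the default.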
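4.2 Completing the automatic connections
After configuration, click Run Block Automation. The dialog shows the settings made earlier; if anything differs from the intended configuration, correct it by hand, then click OK.
Once this is done, VIVADO makes the necessary connections automatically.
This completes the XDMA IP configuration. For XDMA and the host to work together closely, we still need to build the remaining functional modules.
4.3 XDMA project based on block design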
4.4 Adding the interrupt test code
- `timescale 1ns / 1ps
- /*******************************MILIANKE*******************************
- *Company : MiLianKe Electronic Technology Co., Ltd.
- *WebSite:https://www.milianke.com
- *TechWeb:https://www.uisrc.com
- *tmall-shop:https://milianke.tmall.com
- *jd-shop:https://milianke.jd.com
- *taobao-shop1: https://milianke.taobao.com
- *Create Date: 2021/10/15
- *File Name: pcie_top.v
- *Description:
- *Declaration:
- *The reference demo provided by Milianke is only used for learning.
- *We cannot ensure that the demo itself is free of bugs, so users
- *should be responsible for the technical problems and consequences
- *caused by the use of their own products.
- *Copyright: Copyright (c) MiLianKe
- *All rights reserved.
- *Revision: 1.0
- *Signal description
- *1) _i input
- *2) _o output
- *3) _n activ low
- *4) _dg debug signal
- *5) _r delay or register
- *6) _s state mechine
- *********************************************************************/
- module pcie_top
- (
- output [14:0]DDR3_addr,
- output [2:0]DDR3_ba,
- output DDR3_cas_n,
- output [0:0]DDR3_ck_n,
- output [0:0]DDR3_ck_p,
- output [0:0]DDR3_cke,
- output [0:0]DDR3_cs_n,
- output [7:0]DDR3_dm,
- inout [63:0]DDR3_dq,
- inout [7:0]DDR3_dqs_n,
- inout [7:0]DDR3_dqs_p,
- output [0:0]DDR3_odt,
- output DDR3_ras_n,
- output DDR3_reset_n,
- output DDR3_we_n,
- input sysclk_p,
- input sysclk_n,
-
- input [7 :0]pcie_mgt_rxn,
- input [7 :0]pcie_mgt_rxp,
- output [7 :0]pcie_mgt_txn,
- output [7 :0]pcie_mgt_txp,
- input [0 :0]pcie_ref_clk_n,
- input [0 :0]pcie_ref_clk_p,
- input pcie_rst_n
- );
- wire sysclk;
- IBUFDS CLK_U(.I(sysclk_p),.IB(sysclk_n),.O(sysclk));
- wire axi_aclk;
- wire user_irq_en_o;
- reg [21:0]timer_cnt;
- reg timer_r1,timer_r2;
- reg [1:0]int_p;
- reg [3:0]user_irq_req_i;
- wire inter = !timer_r2 && timer_r1;
- always @(posedge axi_aclk)begin
- if(user_irq_en_o == 1'b0)begin
- timer_cnt <= 22'd0;
- end
- else begin
- timer_cnt <= timer_cnt + 1'b1;
- end
- end
- always @(posedge axi_aclk)begin
- if(user_irq_en_o == 1'b0)begin
- timer_r1 <= 1'd0;
- timer_r2 <= 1'd0;
- end
- else begin
- timer_r1 <= timer_cnt[20];
- timer_r2 <= timer_r1;
- end
- end
- always @(posedge axi_aclk)begin
- if(user_irq_en_o == 1'b0)begin
- int_p[1:0] <= 2'd0;
- user_irq_req_i <= 4'd0;
- end
- else begin
- if(inter) int_p <= int_p + 1'b1;
- user_irq_req_i <= 4'd0;
- user_irq_req_i[int_p] <= 1'b1;
- end
- end
- pcie_system pcie_system_i(
- .DDR3_addr(DDR3_addr),
- .DDR3_ba(DDR3_ba),
- .DDR3_cas_n(DDR3_cas_n),
- .DDR3_ck_n(DDR3_ck_n),
- .DDR3_ck_p(DDR3_ck_p),
- .DDR3_cke(DDR3_cke),
- .DDR3_cs_n(DDR3_cs_n),
- .DDR3_dm(DDR3_dm),
- .DDR3_dq(DDR3_dq),
- .DDR3_dqs_n(DDR3_dqs_n),
- .DDR3_dqs_p(DDR3_dqs_p),
- .DDR3_odt(DDR3_odt),
- .DDR3_ras_n(DDR3_ras_n),
- .DDR3_reset_n(DDR3_reset_n),
- .DDR3_we_n(DDR3_we_n),
-
- .pcie_mgt_rxn(pcie_mgt_rxn),
- .pcie_mgt_rxp(pcie_mgt_rxp),
- .pcie_mgt_txn(pcie_mgt_txn),
- .pcie_mgt_txp(pcie_mgt_txp),
- .pcie_ref_clk_n(pcie_ref_clk_n),
- .pcie_ref_clk_p(pcie_ref_clk_p),
- .pcie_rst_n(pcie_rst_n),
- .axi_aclk(axi_aclk),
- .user_irq_en_o(user_irq_en_o),
- .user_irq_req_i(user_irq_req_i),
- .sysclk(sysclk)
- );
-
-
- endmodule
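The test logic in pcie_top raises one interrupt on every rising edge of timer_cnt[20] and rotates through the four request lines. The sketch below works out how often that happens, assuming axi_aclk runs at the 250 MHz configured in the XDMA IP; the function names are ours.

```c
#include <stdint.h>

/* Cycles between two successive rising edges of bit `bit` of a
 * free-running binary counter: the bit toggles every 2^bit cycles. */
static uint64_t cycles_between_rising_edges(unsigned bit)
{
    return 1ull << (bit + 1);
}

/* Convert a cycle count to microseconds for a clock given in MHz. */
static double cycles_to_us(uint64_t cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}
```

For bit 20 this gives 2097152 cycles, about 8.39 ms at 250 MHz, so a new interrupt line is selected roughly every 8.4 ms and each of the four lines repeats about every 33.6 ms.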
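4.5 Address assignment
Assign the addresses:
The DDR attached to M_AXI is assigned starting at address 0 (this must be 0 for Windows); M_AXI is the port that carries DMA operations. The BRAM and the uixdmairq interrupt control unit attached to M_AXI_LITE are mapped into the user BAR address space, whose base is the address set earlier in the XDMA IP. By default, the address of the uixdmairq interrupt control unit must match the user BAR address set in XDMA, 0x44A0_0000 as shown in the figure below. The BRAM address space starts at 0x44A1_0000. The meaning of these address spaces becomes clearer once the software is in use; beginners can simply follow the tutorial settings for now.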
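5 Hardware installation
Note: make sure the TF card holds no program, or remove the TF card.
Download the program first (the bit file during debugging) and only then power on the PC. Only then is the board recognized correctly and can the following tests proceed.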
6 Hardware detection
Self-check steps when hardware detection fails:
7 Application test
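7.1 What xdma_rw.exe does
Open a terminal (the program exits immediately if you double-click it), change to the application directory built in the previous section, and find xdma_rw.exe. This application operates on all of the PCIe device nodes. Typing just xdma_rw.exe in the terminal prints usage information that explains how to use the program.
1: DEVNODE
control: the control channel, used to access XDMA's registers. For lack of time, MiLianKe has not studied XDMA register configuration through the control channel in depth.
event_*: an interrupt event, where * is the interrupt number.
user: the user space; data travels over the AXI4-Lite interface.
h2c_*: host to card, the PC sends DMA data to the board. * is the channel number; usually only channel 0 is used. Data travels over the full AXI4 interface.
c2h_*: card to host, the board sends data to the PC. * is the channel number; usually only channel 0 is used. Data travels over the full AXI4 interface.
2: ADDR (key point)
The read/write address offset. For DMA channels, addresses start at 0, but for PS DDR memory, reads and writes must start at an offset of at least 20 MB.
For reads and writes on user, the offset is the AXI-Lite address of the target IP minus the PCIe to AXI Translation address configured in the XDMA IP. Because MiLianKe's XDMA solution modifies how the driver responds to interrupts, PCIe to AXI Translation must equal the default uixdmairq address. The other AXI-Lite peripherals are then assigned addresses after it.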
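The two address rules above can be written down as a short sketch. It assumes the 0x44A00000 translation value used in this tutorial and reads "20 MB" as 20 × 1024 × 1024 bytes; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCIE_TO_AXI_TRANSLATION 0x44A00000u           /* from the IP settings */
#define PS_DDR_MIN_OFFSET       (20u * 1024u * 1024u) /* assumed meaning of 20 MB */

/* ADDR to pass to xdma_rw.exe for a device on the user (AXI-Lite) space:
 * the AXI-Lite address of the IP minus the translation address. */
static uint32_t user_offset(uint32_t axi_lite_addr)
{
    return axi_lite_addr - PCIE_TO_AXI_TRANSLATION;
}

/* True when a DMA address respects the minimum offset required for PS DDR. */
static bool ps_ddr_addr_ok(uint32_t dma_addr)
{
    return dma_addr >= PS_DDR_MIN_OFFSET;
}
```

With these rules, uixdmairq at 0x44A00000 is reached at user offset 0, and a BRAM placed at 0x44A10000 is reached at user offset 0x10000.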
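3: OPTION
-a set the memory alignment
-b open the file as binary
-f read from or write to a file
-l data length
-v more verbose output
4: DATA
Decimal or hexadecimal numbers, separated by spaces, for example:
17 34 51 68
0x11 0x22 0x33 0x44
7.2 DMA bulk data test
DMA transfer is the mode we use most. It needs the h2c or c2h channels.
The current directory contains a file datafile4k.bin. We test by transferring this file to the FPGA's DDR (or to BRAM on the MA703FA-35T) and then reading it back.
Enter this command in the terminal:
xdma_rw.exe h2c_0 write 0x0000000 -b -f datafile4K.bin -l 4096
It uses the h2c_0 device to read the file datafile4k.bin in binary form and write 4096 bytes to memory (DDR or BRAM) at address 0x0000000.
Read the data back with the command:
xdma_rw.exe c2h_0 read 0x0000000 -b -f datafile4K_recv.bin -l 4096
Next, use a tool such as WinHex to check whether the two files are identical. They are, which shows that the transfer works.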
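As an alternative to inspecting the two files by eye, a byte-by-byte comparison can be scripted. This is a minimal sketch using only the C standard library; `files_identical` is our own helper, not part of the XDMA tools.

```c
#include <stdio.h>

/* Compare two files byte by byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be opened. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    int result = 1;

    if (fa == NULL || fb == NULL) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(fa);
            cb = fgetc(fb);
            if (ca != cb) {      /* different byte, or one file ended early */
                result = 0;
                break;
            }
        } while (ca != EOF);
    }
    if (fa != NULL) fclose(fa);
    if (fb != NULL) fclose(fb);
    return result;
}
```

Calling `files_identical("datafile4K.bin", "datafile4K_recv.bin")` after the two commands above should return 1 when the round trip preserved the data.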
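7.3 user channel test
Write 2 bytes through the AXI-Lite interface to the BRAM attached to it:
xdma_rw.exe user write 0x10000 0x11 0x22
Read 2 bytes through the AXI-Lite interface from the BRAM attached to it:
xdma_rw.exe user read 0x10000 -l 2
7.4 event interrupt test
1: XDMA interrupt, FPGA-side code
First we need to understand XDMA's interrupt types and their control timing.
1) Legacy Interrupts:
For legacy interrupts, usr_irq_req may be cleared when usr_irq_ack goes to 1 for the first time. When usr_irq_ack goes to 1 for the second time, usr_irq_req may be set again to raise a new interrupt.
On the PCI bus, INTx interrupts are carried on four optional interrupt lines. This mechanism is shared: the PCI devices combine their interrupt signals on one interrupt line, which is reported to the CPU, and after receiving the interrupt the CPU has to find out which device raised it.
The PCIe bus no longer has physical INTx lines. The PCIe standard implements INTx interrupts with dedicated Message transactions, for compatibility with older PCI software. Because INTx is shared, the CPU still has to query for the actual source after responding, which is inefficient.
2) MSI Interrupts:
After usr_irq_req is raised for MSI, usr_irq_ack going to 1 only means that the host has received the interrupt, not that it has been serviced. Software or the driver can then clear usr_irq_req.
MSI and MSI-X both generate an interrupt by a memory write to a configured CPU interrupt register address, which is more efficient than the shared INTx mechanism. MSI supports up to 32 interrupt vectors and MSI-X up to 2048.
3) MSI-X Interrupts:
After usr_irq_req is raised, it may be cleared as soon as usr_irq_ack is 1, but the documentation does not say when it may be set to 1 again to start the next interrupt.
After testing all of the interrupt modes above, we find that Legacy and MSI are sufficient and relatively stable. The host driver manages received interrupts together with MiLianKe's uixdmairq IP core by accessing the user BAR address space.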
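The way uixdmairq holds an interrupt until the driver clears it can be modelled in a few lines of C. This is a behavioural sketch only, assuming 8 request lines (the default XMDA_REQ_NUM); it ignores the pipeline registers of the real IP, and the `irq_model` and `irq_step` names are ours.

```c
#include <stdint.h>

struct irq_model {
    uint8_t prev;     /* request inputs seen on the previous step */
    uint8_t pending;  /* models the xdma_irq_req_o output */
};

/* One step of the model: a rising edge on a request bit sets the matching
 * pending bit, and an acknowledge bit written by the host clears it.
 * When both happen on the same bit, the acknowledge wins. */
static uint8_t irq_step(struct irq_model *m, uint8_t req, uint8_t ack)
{
    uint8_t rising = (uint8_t)(req & (uint8_t)~m->prev);
    m->prev = req;
    m->pending = (uint8_t)((m->pending | rising) & (uint8_t)~ack);
    return m->pending;
}
```

A request therefore stays pending towards XDMA, however short the original pulse, until the driver writes the matching acknowledge bit.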
Source of uixdmairq.v:
- /*******************************MILIANKE*******************************
- *Company : MiLianKe Electronic Technology Co., Ltd.
- *WebSite:https://www.milianke.com
- *TechWeb:https://www.uisrc.com
- *tmall-shop:https://milianke.tmall.com
- *jd-shop:https://milianke.jd.com
- *taobao-shop1: https://milianke.taobao.com
- *Create Date : 2022/05/01
- *Module Name:uixdmairq
- *File Name:uixdmairq.v
- *Description :
- *The reference demo provided by Milianke is only used for learning.
- *We cannot ensure that the demo itself is free of bugs, so users
- *should be responsible for the technical problems and consequences
- *caused by the use of their own products.
- *Copyright: Copyright (c) MiLianKe
- *All rights reserved.
- *Revision : 1.0
- *Signal description
- *1) _i input
- *2) _o output
- *3) _n activ low
- *4) _dg debug signal
- *5) _r delay or register
- *6) _s state mechine
- *********************************************************************/
- `timescale 1ns / 1ps
- module uixdmairq #
- (
- parameter integer XMDA_REQ_NUM = 8
- )
- (
- // Users to add ports here
- input wire [XMDA_REQ_NUM-1 :0] user_irq_req_i,
- //input wire [XMDA_REQ_NUM-1 :0] xdma_irq_ack_i,
- output wire [XMDA_REQ_NUM-1 :0] xdma_irq_req_o,
- output wire user_irq_en_o,
- input wire S_AXI_ACLK,
- input wire S_AXI_ARESETN,
- input wire [3 : 0] S_AXI_AWADDR,
- input wire [2 : 0] S_AXI_AWPROT,
- input wire S_AXI_AWVALID,
- output wire S_AXI_AWREADY,
- input wire [31 : 0] S_AXI_WDATA,
- input wire [3 : 0] S_AXI_WSTRB,
- input wire S_AXI_WVALID,
- output wire S_AXI_WREADY,
- output wire [1 : 0] S_AXI_BRESP,
- output wire S_AXI_BVALID,
- input wire S_AXI_BREADY,
- input wire [3 : 0] S_AXI_ARADDR,
- input wire [2 : 0] S_AXI_ARPROT,
- input wire S_AXI_ARVALID,
- output wire S_AXI_ARREADY,
- output wire [31 : 0] S_AXI_RDATA,
- output wire [1 : 0] S_AXI_RRESP,
- output wire S_AXI_RVALID,
- input wire S_AXI_RREADY
- );
- reg [XMDA_REQ_NUM -1 :0] user_irq_req;
- reg [XMDA_REQ_NUM -1 :0] user_irq_req_r1;
- reg [XMDA_REQ_NUM -1 :0] user_irq_req_r2;
- reg [XMDA_REQ_NUM -1 :0] user_irq_req_r3;
- reg [XMDA_REQ_NUM -1 :0] xdma_irq_ack_r1;
- reg [XMDA_REQ_NUM -1 :0] xdma_irq_ack_r2;
- reg [XMDA_REQ_NUM -1 :0] xdma_irq_ack_r3;
- //reg [XMDA_REQ_NUM -1 :0] xdma_irq_ack;
- reg [XMDA_REQ_NUM -1 :0] xdma_irq_req;
- // AXI4LITE signals
- reg [3:0] axi_awaddr;
- reg axi_awready;
- reg axi_wready;
- reg [1 : 0] axi_bresp;
- reg axi_bvalid;
- reg [3:0] axi_araddr;
- reg axi_arready;
- reg [31 : 0] axi_rdata;
- reg [1 : 0] axi_rresp;
- reg axi_rvalid;
- // Example-specific design signals
- // local parameter for addressing 32 bit / 64 bit C S AXI_DATA_WIDTH
- // ADDR_LSB is used for addressing 32/64 bit registers/memories
- // ADDR_LSB = 2 for 32 bits (n downto 2)
- // ADDR_LSB = 3 for 64 bits (n downto 3)
- localparam integer ADDR_LSB = 2;
- localparam integer OPT_MEM_ADDR_BITS = 1;
- //----------------------------------------------
- //-- Signals for user logic register space example
- //------------------------------------------------
- //-- Number of Slave Registers 4
- reg [31:0] slv_reg0;
- reg [31:0] slv_reg1;
- wire slv_reg_rden;
- wire slv_reg_wren;
- reg [31:0] reg_data_out;
- integer byte_index;
- reg aw_en;
- // I/O Connections assignments
- assign S_AXI_AWREADY = axi_awready;
- assign S_AXI_WREADY = axi_wready;
- assign S_AXI_BRESP = axi_bresp;
- assign S_AXI_BVALID = axi_bvalid;
- assign S_AXI_ARREADY = axi_arready;
- assign S_AXI_RDATA = axi_rdata;
- assign S_AXI_RRESP = axi_rresp;
- assign S_AXI_RVALID = axi_rvalid;
- // Implement axi_awready generation
- // axi_awready is asserted for one S_AXI_ACLK clock cycle when both
- // S_AXI_AWVALID and S_AXI_WVALID are asserted. axi_awready is
- // de-asserted when reset is low.
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_awready <= 1 'b0;
- aw_en <= 1 'b1;
- end
- else
- begin
- if (~axi_awready && S_AXI_AWVALID && S_AXI_WVALID && aw_en)
- begin
- // slave is ready to accept write address when
- // there is a valid write address and write data
- // on the write address and data bus. This design
- // expects no outstanding transactions.
- axi_awready <= 1 'b1;
- aw_en <= 1 'b0;
- end
- else if (S_AXI_BREADY && axi_bvalid)
- begin
- aw_en <= 1 'b1;
- axi_awready <= 1 'b0;
- end
- else
- begin
- axi_awready <= 1 'b0;
- end
- end
- end
- // Implement axi_awaddr latching
- // This process is used to latch the address when both
- // S_AXI_AWVALID and S_AXI_WVALID are valid.
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_awaddr <= 0;
- end
- else
- begin
- if (~axi_awready && S_AXI_AWVALID && S_AXI_WVALID && aw_en)
- begin
- // Write Address latching
- axi_awaddr <= S_AXI_AWADDR;
- end
- end
- end
- // Implement axi_wready generation
- // axi_wready is asserted for one S_AXI_ACLK clock cycle when both
- // S_AXI_AWVALID and S_AXI_WVALID are asserted. axi_wready is
- // de-asserted when reset is low.
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_wready <= 1 'b0;
- end
- else
- begin
- if (~axi_wready && S_AXI_WVALID && S_AXI_AWVALID && aw_en )
- begin
- // slave is ready to accept write data when
- // there is a valid write address and write data
- // on the write address and data bus. This design
- // expects no outstanding transactions.
- axi_wready <= 1 'b1;
- end
- else
- begin
- axi_wready <= 1 'b0;
- end
- end
- end
- assign slv_reg_wren = axi_wready && S_AXI_WVALID && axi_awready && S_AXI_AWVALID;
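- // slv_reg0 holds the interrupt acknowledge bits. It clears itself on the
- // cycle after a write, so each acknowledge is a one-clock pulse.
- // slv_reg1 holds the control bits (bit 31 is the interrupt enable).
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1'b0 )begin
- slv_reg0 <= 0;
- slv_reg1 <= 0;
- end
- else if(slv_reg_wren)begin
- if (axi_awaddr[3:2] == 2'd0)
- slv_reg0[31:0] <= S_AXI_WDATA[31:0];
- else if (axi_awaddr[3:2] == 2'd1)
- slv_reg1[31:0] <= S_AXI_WDATA[31:0];
- end else begin
- slv_reg0 <= 0;
- end
- end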
- // Implement write response logic generation
- // The write response and response valid signals are asserted by the slave
- // when axi_wready, S_AXI_WVALID, axi_wready and S_AXI_WVALID are asserted.
- // This marks the acceptance of address and indicates the status of
- // write transaction.
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_bvalid<=0;
- axi_bresp<=2'b0;
- end
- else
- begin
- if (axi_awready && S_AXI_AWVALID && ~axi_bvalid && axi_wready && S_AXI_WVALID)
- begin
- // indicates a valid write response is available
- axi_bvalid <= 1 'b1;
- axi_bresp <= 2 'b0; // 'OKAY ' response
- end // work error responses in future
- else
- begin
- if (S_AXI_BREADY && axi_bvalid)
- //check if bready is asserted while bvalid is high)
- //(there is a possibility that bready is always asserted high)
- begin
- axi_bvalid <= 1 'b0;
- end
- end
- end
- end
- // Implement axi_arready generation
- // axi_arready is asserted for one S_AXI_ACLK clock cycle when
- // S_AXI_ARVALID is asserted. axi_awready is
- // de-asserted when reset (active low) is asserted.
- // The read address is also latched when S_AXI_ARVALID is
- // asserted. axi_araddr is reset to zero on reset assertion.
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_arready <= 1 'b0;
- axi_araddr <= 4'b0;
- end
- else
- begin
- if (~axi_arready && S_AXI_ARVALID)
- begin
- // indicates that the slave has acceped the valid read address
- axi_arready <= 1 'b1;
- // Read address latching
- axi_araddr <= S_AXI_ARADDR;
- end
- else
- begin
- axi_arready <= 1 'b0;
- end
- end
- end
- // Implement axi_arvalid generation
- // axi_rvalid is asserted for one S_AXI_ACLK clock cycle when both
- // S_AXI_ARVALID and axi_arready are asserted. The slave registers
- // data are available on the axi_rdata bus at this instance. The
- // assertion of axi_rvalid marks the validity of read data on the
- // bus and axi_rresp indicates the status of read transaction.axi_rvalid
- // is deasserted on reset (active low). axi_rresp and axi_rdata are
- // cleared to zero on reset (active low).
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_rvalid <= 0;
- axi_rresp <= 0;
- end
- else
- begin
- if (axi_arready && S_AXI_ARVALID && ~axi_rvalid)
- begin
- // Valid read data is available at the read data bus
- axi_rvalid <= 1 'b1;
- axi_rresp <= 2 'b0; // 'OKAY ' response
- end
- else if (axi_rvalid && S_AXI_RREADY)
- begin
- // Read data is accepted by the master
- axi_rvalid <= 1 'b0;
- end
- end
- end
- // Implement memory mapped register select and read logic generation
- // Slave register read enable is asserted when valid address is available
- // and the slave is ready to accept the read address.
- assign slv_reg_rden = axi_arready & S_AXI_ARVALID & ~axi_rvalid;
- // Read multiplexer: select the register addressed by the latched read address
- always @(*)
- if(axi_araddr[3:2] == 2'd0)
- reg_data_out = slv_reg0;
- else if(axi_araddr[3:2] == 2'd1)
- reg_data_out = slv_reg1;
- else
- reg_data_out = 32'd0;
- // Output register or memory read data
- always @( posedge S_AXI_ACLK )
- begin
- if ( S_AXI_ARESETN == 1 'b0 )
- begin
- axi_rdata <= 0;
- end
- else
- begin
- // When there is a valid read address (S_AXI_ARVALID) with
- // acceptance of read address by the slave (axi_arready),
- // output the read dada
- if (slv_reg_rden) begin
- axi_rdata <= reg_data_out; // register read data
- end
- end
- end
- // Add user logic here
- reg [4 :0] i;
- reg [4 :0] j;
- reg [4 :0] k;
- assign xdma_irq_req_o = xdma_irq_req;
- assign user_irq_en_o = slv_reg1[31];
- wire [XMDA_REQ_NUM-1:0] xdma_irq_ack = slv_reg0[XMDA_REQ_NUM-1:0];
- always @( posedge S_AXI_ACLK ) begin
- if ( S_AXI_ARESETN == 1 'b0 || user_irq_en_o == 1 'b0 )begin
- user_irq_req_r1 <= 0;
- user_irq_req_r2 <= 0;
- user_irq_req_r3 <= 0;
- end
- else begin
- user_irq_req_r1 <= user_irq_req_i;
- user_irq_req_r2 <= user_irq_req_r1;
- user_irq_req_r3 <= user_irq_req_r2;
- end
- end
- always @( posedge S_AXI_ACLK ) begin
- if ( S_AXI_ARESETN == 1'b0 || user_irq_en_o == 1'b0)begin
- user_irq_req <= 0;
- end
- else begin
- for(j = 0; j <=XMDA_REQ_NUM-1; j = j +1 )
- user_irq_req[j] <= !user_irq_req_r3[j] & user_irq_req_r2[j];
- end
- end
- always @( posedge S_AXI_ACLK ) begin
- if ( S_AXI_ARESETN == 1'b0 || user_irq_en_o == 1'b0)begin
- xdma_irq_req <= 0;
- end
- else begin
- for(i = 0; i <=XMDA_REQ_NUM-1; i = i +1 )begin
- if(xdma_irq_ack[i]) begin
- xdma_irq_req[i] <= 1 'b0;
- end
- else if(user_irq_req[i])begin
- xdma_irq_req[i] <= 1 'b1;
- end
- end
- end
- end
- // User logic ends
- endmodule
- wire axi_aclk;
- wire axi_aresetn;
- wire user_irq_en_o;
- reg [21:0]timer_cnt;
- always @(posedge axi_aclk)begin
- if(!axi_aresetn || !user_irq_en_o)begin
- timer_cnt <= 22'd0;
- end
- else begin
- timer_cnt <= timer_cnt + 1'b1;
- end
- end
- reg timer_r1,timer_r2;
- wire inter = !timer_r2 && timer_r1;
- always @(posedge axi_aclk)begin
- if(!axi_aresetn || !user_irq_en_o)begin
- timer_r1 <= 1'd0;
- timer_r2 <= 1'd0;
- end
- else begin
- timer_r1 <= timer_cnt[20];
- timer_r2 <= timer_r1;
- end
- end
- reg [1:0]int_p;
- reg [3:0]user_irq_req_i;
- always @(posedge axi_aclk)begin
- if(!axi_aresetn || !user_irq_en_o)begin
- int_p[1:0] <= 2'd0;
- user_irq_req_i <= 4'd0;
- end
- else begin
- if(inter) int_p <= int_p + 1'b1;
- user_irq_req_i <= 4'd0;
- user_irq_req_i[int_p] <= 1'b1;
- end
- end
2: XDMA interrupt, host-side code
Source of the interrupt program, intr_event.c:
- #include <Windows.h>
- #include <assert.h>
- #include <stdlib.h>
- #include <stdio.h>
- #include <stdarg.h>
- #include <strsafe.h>
- #include <stdint.h>
- #include <SetupAPI.h>
- #include <INITGUID.H>
- #include <WinIoCtl.h>
- //#include <AtlBase.h>
- #include <io.h>
- #include "xdma_public.h"
- #pragma comment(lib, "setupapi.lib")
- #pragma warning(disable:4996)
- volatile BYTE start_en; // set by main(), polled by the event threads
- HANDLE h_user;
- DWORD user_irq_ack[1];
- char base_path[MAX_PATH + 1] = "";
- #define MAX_BYTES_PER_TRANSFER 0x800000
- static int verbose_msg(const char *const fmt, ...) {
-     int ret = 0;
-     va_list args;
-     if (1) {
-         va_start(args, fmt);
-         ret = vprintf(fmt, args);
-         va_end(args);
-     }
-     return ret;
- }
- static BYTE *allocate_buffer(size_t size, size_t alignment) {
-     if (size == 0) {
-         size = 4;
-     }
-     if (alignment == 0) {
-         SYSTEM_INFO sys_info;
-         GetSystemInfo(&sys_info);
-         alignment = sys_info.dwPageSize;
-         //printf("alignment = %d\n", alignment);
-     }
-     verbose_msg("Allocating host-side buffer of size %zu, aligned to %zu bytes\n", size, alignment);
-     return (BYTE *)_aligned_malloc(size, alignment);
- }
- static int get_devices(GUID guid, char *devpath, size_t len_devpath) {
-     SP_DEVICE_INTERFACE_DATA device_interface;
-     PSP_DEVICE_INTERFACE_DETAIL_DATA dev_detail;
-     DWORD index;
-     HDEVINFO device_info;
-     wchar_t tmp[256];
-     device_info = SetupDiGetClassDevs((LPGUID)&guid, NULL, NULL, DIGCF_PRESENT | DIGCF_DEVICEINTERFACE);
-     if (device_info == INVALID_HANDLE_VALUE) {
-         fprintf(stderr, "GetDevices INVALID_HANDLE_VALUE\n");
-         exit(-1);
-     }
-     device_interface.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
-     // enumerate through devices
-     for (index = 0; SetupDiEnumDeviceInterfaces(device_info, NULL, &guid, index, &device_interface); ++index) {
-         // get required buffer size
-         ULONG detailLength = 0;
-         if (!SetupDiGetDeviceInterfaceDetail(device_info, &device_interface, NULL, 0, &detailLength, NULL) && GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
-             fprintf(stderr, "SetupDiGetDeviceInterfaceDetail - get length failed\n");
-             break;
-         }
-         // allocate space for device interface detail
-         dev_detail = (PSP_DEVICE_INTERFACE_DETAIL_DATA)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, detailLength);
-         if (!dev_detail) {
-             fprintf(stderr, "HeapAlloc failed\n");
-             break;
-         }
-         dev_detail->cbSize = sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA);
-         // get device interface detail
-         if (!SetupDiGetDeviceInterfaceDetail(device_info, &device_interface, dev_detail, detailLength, NULL, NULL)) {
-             fprintf(stderr, "SetupDiGetDeviceInterfaceDetail - get detail failed\n");
-             HeapFree(GetProcessHeap(), 0, dev_detail);
-             break;
-         }
-         // copy no more than tmp can hold (the original passed len_devpath here)
-         StringCchCopy(tmp, sizeof(tmp) / sizeof(tmp[0]), dev_detail->DevicePath);
-         wcstombs(devpath, tmp, 256);
-         HeapFree(GetProcessHeap(), 0, dev_detail);
-     }
-     SetupDiDestroyDeviceInfoList(device_info);
-     return index;
- }
- HANDLE open_devices(char *device_base_path, char *device_name, DWORD accessFlags)
- {
-     char device_path[MAX_PATH + 1] = "";
-     wchar_t device_path_w[MAX_PATH + 1];
-     HANDLE h;
-     // extend device path to include target device node (xdma_control, xdma_user etc)
-     verbose_msg("Device base path: %s\n", device_base_path);
-     strcpy_s(device_path, sizeof device_path, device_base_path);
-     strcat_s(device_path, sizeof device_path, device_name);
-     verbose_msg("Device node: %s\n", device_name);
-     // open device file
-     mbstowcs(device_path_w, device_path, sizeof(device_path));
-     h = CreateFile(device_path_w, accessFlags, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
-     if (h == INVALID_HANDLE_VALUE)
-     {
-         fprintf(stderr, "Error opening device, win32 error code: %ld\n", GetLastError());
-     }
-     return h;
- }
- static int read_device(HANDLE device, long address, DWORD size, BYTE *buffer)
- {
-     DWORD rd_size = 0;
-     unsigned int transfers;
-     unsigned int i;
-     if (INVALID_SET_FILE_POINTER == SetFilePointer(device, address, NULL, FILE_BEGIN)) {
-         fprintf(stderr, "Error setting file pointer, win32 error code: %ld\n", GetLastError());
-         return -3;
-     }
-     transfers = (unsigned int)(size / MAX_BYTES_PER_TRANSFER);
-     for (i = 0; i < transfers; i++)
-     {
-         if (!ReadFile(device, (void *)(buffer + i * MAX_BYTES_PER_TRANSFER), (DWORD)MAX_BYTES_PER_TRANSFER, &rd_size, NULL))
-         {
-             return -1;
-         }
-         if (rd_size != MAX_BYTES_PER_TRANSFER)
-         {
-             return -2;
-         }
-     }
-     if (!ReadFile(device, (void *)(buffer + i * MAX_BYTES_PER_TRANSFER), (DWORD)(size - i * MAX_BYTES_PER_TRANSFER), &rd_size, NULL))
-     {
-         return -1;
-     }
-     if (rd_size != (size - i * MAX_BYTES_PER_TRANSFER))
-     {
-         return -2;
-     }
-     return size;
- }
- static int write_device(HANDLE device, long address, DWORD size, BYTE *buffer)
- {
-     DWORD wr_size = 0;
-     unsigned int transfers;
-     unsigned int i;
-     transfers = (unsigned int)(size / MAX_BYTES_PER_TRANSFER);
-     //printf("transfers = %d\n", transfers);
-     if (INVALID_SET_FILE_POINTER == SetFilePointer(device, address, NULL, FILE_BEGIN)) {
-         fprintf(stderr, "Error setting file pointer, win32 error code: %ld\n", GetLastError());
-         return -3;
-     }
-     for (i = 0; i < transfers; i++)
-     {
-         if (!WriteFile(device, (void *)(buffer + i * MAX_BYTES_PER_TRANSFER), MAX_BYTES_PER_TRANSFER, &wr_size, NULL))
-         {
-             return -1;
-         }
-         if (wr_size != MAX_BYTES_PER_TRANSFER)
-         {
-             return -2;
-         }
-     }
-     if (!WriteFile(device, (void *)(buffer + i * MAX_BYTES_PER_TRANSFER), (DWORD)(size - i * MAX_BYTES_PER_TRANSFER), &wr_size, NULL))
-     {
-         return -1;
-     }
-     if (wr_size != (size - i * MAX_BYTES_PER_TRANSFER))
-     {
-         return -2;
-     }
-     return size;
- }
- DWORD WINAPI thread_event0(LPVOID lpParam)
- {
-     BYTE val0[1] = "";
-     DWORD i = 0;
-     BYTE statu;
-     char *device_name1 = "\\event_0";
-     HANDLE h_event0 = open_devices(base_path, device_name1, GENERIC_READ);
-     while (1)
-     {
-         if (start_en) {
-             read_device(h_event0, 0, 1, val0); // wait for irq
-             Sleep(1);
-             if (val0[0] == 1)
-                 printf("event_0 done!\n");
-             else
-                 printf("event_0 timeout!\n");
-             i++;
-         }
-     }
-     CloseHandle(h_event0);
-     return 0;
- }
- DWORD WINAPI thread_event1(LPVOID lpParam)
- {
-     BYTE val0[1] = "";
-     DWORD i = 0;
-     BYTE statu;
-     char *device_name1 = "\\event_1";
-     HANDLE h_event1 = open_devices(base_path, device_name1, GENERIC_READ);
-     while (1)
-     {
-         if (start_en) {
-             read_device(h_event1, 0, 1, val0); // wait for irq
-             Sleep(1);
-             if (val0[0] == 1)
-                 printf("event_1 done!\n");
-             else
-                 printf("event_1 timeout!\n");
-             i++;
-         }
-     }
-     CloseHandle(h_event1);
-     return 0;
- }
- DWORD WINAPI thread_event2(LPVOID lpParam)
- {
-     BYTE val0[1] = "";
-     DWORD i = 0;
-     BYTE statu;
-     char *device_name1 = "\\event_2";
-     HANDLE h_event2 = open_devices(base_path, device_name1, GENERIC_READ);
-     while (1)
-     {
-         if (start_en) {
-             read_device(h_event2, 0, 1, val0); // wait for irq
-             Sleep(1);
-             if (val0[0] == 1)
-                 printf("event_2 done!\n");
-             else
-                 printf("event_2 timeout!\n");
-             i++;
-         }
-     }
-     CloseHandle(h_event2);
-     return 0;
- }
- DWORD WINAPI thread_event3(LPVOID lpParam)
- {
-     BYTE val0[1] = "";
-     DWORD i = 0;
-     BYTE statu;
-     char *device_name1 = "\\event_3";
-     HANDLE h_event3 = open_devices(base_path, device_name1, GENERIC_READ);
-     while (1)
-     {
-         if (start_en) {
-             read_device(h_event3, 0, 1, val0); // wait for irq
-             Sleep(1);
-             if (val0[0] == 1)
-                 printf("event_3 done!\n");
-             else
-                 printf("event_3 timeout!\n");
-             i++;
-         }
-     }
-     CloseHandle(h_event3);
-     return 0;
- }
- int __cdecl main(int argc, char *argv[])
- {
-     HANDLE h_event0;
-     HANDLE h_event1;
-     HANDLE h_event2;
-     HANDLE h_event3;
-     HANDLE h_event4;
-     HANDLE h_event5;
-     HANDLE h_event6;
-     HANDLE h_event7;
-     char *device_name = "\\user";
-     DWORD num_devices = get_devices(GUID_DEVINTERFACE_XDMA, base_path, sizeof(base_path));
-     verbose_msg("Devices found: %d\n", num_devices);
-     if (num_devices < 1)
-     {
-         printf("error\n");
-         return -1; // no XDMA device, nothing to test
-     }
-     h_user = open_devices(base_path, device_name, GENERIC_READ | GENERIC_WRITE);
-     if (h_user == INVALID_HANDLE_VALUE)
-     {
-         fprintf(stderr, "Error opening device, win32 error code: %ld\n", GetLastError());
-     }
-     h_event0 = CreateThread(NULL, 0, thread_event0, NULL, 0, NULL);
-     h_event1 = CreateThread(NULL, 0, thread_event1, NULL, 0, NULL);
-     h_event2 = CreateThread(NULL, 0, thread_event2, NULL, 0, NULL);
-     h_event3 = CreateThread(NULL, 0, thread_event3, NULL, 0, NULL);
-     user_irq_ack[0] = 0xffff0000;
-     write_device(h_user, 0x00004, 4, (BYTE *)user_irq_ack); // start irq
-     start_en = 1;
-     printf("start\n");
-     WaitForSingleObject(h_event0, INFINITE);
-     WaitForSingleObject(h_event1, INFINITE);
-     WaitForSingleObject(h_event2, INFINITE);
-     WaitForSingleObject(h_event3, INFINITE);
-     user_irq_ack[0] = 0x00000000;
-     write_device(h_user, 0x00004, 4, (BYTE *)user_irq_ack); // stop irq
-     CloseHandle(h_user);
-     CloseHandle(h_event0);
-     CloseHandle(h_event1);
-     CloseHandle(h_event2);
-     CloseHandle(h_event3);
-     return 0;
- }
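The register writes in main() line up with uixdmairq.v: offset 0x4 is slv_reg1, whose bit 31 drives user_irq_en_o, and offset 0x0 is slv_reg0, whose low bits acknowledge individual interrupts. The sketch below spells this out; the macro and function names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

#define UIXDMAIRQ_ACK_OFFSET    0x0u        /* slv_reg0: acknowledge bits */
#define UIXDMAIRQ_ENABLE_OFFSET 0x4u        /* slv_reg1: control bits */
#define UIXDMAIRQ_ENABLE_BIT    (1u << 31)  /* slv_reg1[31] -> user_irq_en_o */

/* Nonzero when a value written to slv_reg1 turns the interrupt logic on. */
static int irq_enabled(uint32_t reg1_value)
{
    return (reg1_value & UIXDMAIRQ_ENABLE_BIT) != 0;
}

/* Word to write to slv_reg0 to acknowledge interrupt line `irq`. */
static uint32_t irq_ack_word(unsigned irq)
{
    return 1u << irq;
}
```

The value 0xffff0000 used by the program sets bit 31, so `irq_enabled(0xffff0000u)` is true; only that bit matters to the Verilog shown earlier.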
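The following statements enable the interrupts:
user_irq_ack[0] = 0xffff0000;
write_device(h_user, 0x00004, 4, (BYTE *)user_irq_ack);
The following statements disable the interrupts:
user_irq_ack[0] = 0x00000000;
write_device(h_user, 0x00004, 4, (BYTE *)user_irq_ack);
The program above sets up 4 interrupt events and starts one thread per event. While a thread waits for its interrupt it is suspended; when the interrupt arrives, the thread continues. XDMA itself supports up to 16 user interrupts.
3: XDMA interrupt test
Run the xdma_event.exe program.
The output shows 4 interrupt events; XDMA actually supports up to 16 interrupt events. More interrupt events make better use of multi-core, multi-threaded CPUs.
Waveform captured on the FPGA:
That covers the core of the XDMA PCIe solution. The material XILINX provides officially is often short on detail; MiLianKe has made small modifications to the driver above to support interrupts better.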