Title:
APPARATUS AND METHOD FOR LOGICAL CHANNEL PRIORITIZATION (LCP) PROCESSING OF HIGH-DENSITY, HIGH-PRIORITY SMALL PACKETS
Document Type and Number:
WIPO Patent Application WO/2024/063785
Kind Code:
A1
Abstract:
According to one aspect of the present disclosure, a baseband chip is provided. The baseband chip may include uplink (UL) data plane (DP) hardware. The baseband chip may include custom memory (CM). The CM may receive, in a logical channel queue (LCQ), a plurality of high-priority small packets (HPS) from an external memory. The CM may send a direct memory access (DMA) complete message to HPS custom instructions (CX) located in a UL logical channel prioritization (LCP) microcontroller (uC) once the plurality of HPS are received. The HPS CX may perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a set of Layer 2 commands for use by the UL DP hardware to generate at least one media access control (MAC) sub-protocol data unit (SubPDU). The Layer 2 commands may be generated prior to the UL LCP uC receiving a UL grant.

Inventors:
LOW SU-LIN (US)
LEE CHUN-I (US)
CHEN NA (US)
MA TIANAN TIM (US)
Application Number:
PCT/US2022/044569
Publication Date:
March 28, 2024
Filing Date:
September 23, 2022
Assignee:
ZEKU INC (US)
International Classes:
H04L69/324; G06F13/00; H04L69/04; H04W80/02
Domestic Patent References:
WO2021152369A1 (2021-08-05)
Foreign References:
JP2018152638A (2018-09-27)
US20190274143A1 (2019-09-05)
US20090031325A1 (2009-01-29)
Attorney, Agent or Firm:
ZOU, Zhiwei (US)
Claims:
WHAT IS CLAIMED IS:

1. A baseband chip, comprising: custom memory (CM) configured to: receive, in a first high-density, high-priority small-packet (HPS) logical channel queue (LCQ), a plurality of HPS from an external memory; and send a direct memory access (DMA) complete message to HPS custom instructions (CX) once the plurality of HPS are received, the HPS CX being located in an uplink (UL) logical channel prioritization (LCP) microcontroller (uC) that comprises a co-processor or a set of hardware threads to which the HPS CX are coupled, a Layer 2 data plane (DP) subsystem comprising: uplink (UL) DP hardware; and the UL LCP uC comprising the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX, wherein the HPS CX are configured to: in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware, wherein the LCP CX are configured to: receive a UL grant from a physical layer (PHY) receiver; and in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware, and wherein the first set of Layer 2 commands are generated prior to the UL grant being received by the LCP CX.

2. The baseband chip of claim 1, wherein, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX are configured to: perform robust header compression (ROHC) for each HPS of the plurality of HPS; generate service data adaptation protocol (SDAP), packet data convergence protocol (PDCP), radio link control (RLC), and media access control (MAC) headers for each HPS of the plurality of HPS; and construct Layer 2 command packet descriptors for each HPS of the plurality of HPS.

3. The baseband chip of claim 1, further comprising: an external memory configured to: receive, in a second HPS LCQ, the plurality of HPS from a Layer 3 subsystem; and in response to a number of HPS in the second HPS LCQ meeting a threshold number, autonomously page-in the plurality of HPS into the first HPS LCQ of the CM.

4. The baseband chip of claim 3, wherein the plurality of HPS are autonomously paged into the first HPS LCQ of the CM by DMA.

5. The baseband chip of claim 1, wherein the LCP CX are configured to: perform LCP processing of one or more of control packets, retransmission packets, or normal-priority packets residing in different LCQs of the CM.

6. The baseband chip of claim 5, wherein the LCP CX are configured to: in response to a completion of the LCP processing, enqueue the second set of Layer 2 commands into a Layer 2 command queue of the CM.

7. The baseband chip of claim 6, wherein the UL DP hardware is configured to: receive the first set of Layer 2 commands and the second set of Layer 2 commands; generate a plurality of media access control sub-protocol data units (MAC SubPDUs) based on the first set of Layer 2 commands and the second set of Layer 2 commands; and send the MAC SubPDUs to a physical layer (PHY) transmitter (TX) for transmission to a base station.

8. An apparatus for wireless communication of a user equipment (UE), comprising: a baseband chip comprising: custom memory (CM) configured to: receive, in a first high-density, high-priority small-packet (HPS) logical channel queue (LCQ), a plurality of HPS from an external memory; and send a direct memory access (DMA) complete message to HPS custom instructions (CX) once the plurality of HPS are received, the HPS CX being located in an uplink (UL) logical channel prioritization (LCP) microcontroller (uC) that comprises a co-processor or a set of hardware threads to which the HPS CX are coupled, a Layer 2 data plane (DP) subsystem comprising: uplink (UL) DP hardware; and the UL LCP uC comprising the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX, wherein the HPS CX are configured to: in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware, wherein the LCP CX are configured to: receive a UL grant from a physical layer (PHY) receiver; and in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware, and wherein the first set of Layer 2 commands are generated prior to the UL grant being received by the LCP CX.

9. The apparatus of claim 8, wherein, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX are configured to: perform robust header compression (ROHC) for each HPS of the plurality of HPS; generate service data adaptation protocol (SDAP), packet data convergence protocol (PDCP), radio link control (RLC), and media access control (MAC) headers for each HPS of the plurality of HPS; and construct Layer 2 command packet descriptors for each HPS of the plurality of HPS.

10. The apparatus of claim 8, further comprising: an external memory configured to: receive, in a second HPS LCQ, the plurality of HPS from a Layer 3 subsystem; and in response to a number of HPS in the second HPS LCQ meeting a threshold number, autonomously page-in the plurality of HPS into the first HPS LCQ of the CM.

11. The apparatus of claim 10, wherein the plurality of HPS are autonomously paged into the first HPS LCQ of the CM by DMA.

12. The apparatus of claim 8, wherein the LCP CX are configured to: perform LCP processing of one or more of control packets, retransmission packets, or normal-priority packets residing in different LCQs of the CM.

13. The apparatus of claim 12, wherein the LCP CX are configured to: in response to a completion of the LCP processing, enqueue the second set of Layer 2 commands into a Layer 2 command queue of the CM.

14. The apparatus of claim 13, wherein the UL DP hardware is configured to: receive the first set of Layer 2 commands and the second set of Layer 2 commands; generate a plurality of media access control sub-protocol data units (MAC SubPDUs) based on the first set of Layer 2 commands and the second set of Layer 2 commands; and send the MAC SubPDUs to a physical layer (PHY) transmitter (TX) for transmission to a base station.

15. A method of wireless communication of a baseband chip, comprising: receiving, in a first HPS logical channel queue (LCQ) of custom memory (CM) coupled to an uplink (UL) logical channel prioritization (LCP) uC, a plurality of high-priority small packets (HPS) from an external memory; sending, by the CM, a direct memory access (DMA) complete message to HPS CX of the UL LCP uC once the plurality of HPS are received; in response to receiving the DMA complete message, performing, by the HPS CX of the UL LCP uC, HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by UL data plane (DP) hardware; receiving, by LCP custom instructions (CX) of the UL LCP uC, a UL grant from a physical layer (PHY) receiver; and in response to receiving the UL grant from the PHY receiver, performing, by the LCP CX of the UL LCP uC, LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware, wherein the first set of Layer 2 commands are generated prior to the UL grant being received by the LCP CX.

16. The method of claim 15, wherein the performing, by the HPS CX, the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM comprises: performing robust header compression (ROHC) for each HPS of the plurality of HPS; generating service data adaptation protocol (SDAP), packet data convergence protocol (PDCP), radio link control (RLC), and media access control (MAC) headers for each HPS of the plurality of HPS; and constructing Layer 2 command packet descriptors for each HPS of the plurality of HPS.

17. The method of claim 15, further comprising: receiving, in a second HPS LCQ of an external memory, the plurality of HPS from a Layer 3 subsystem; and in response to a number of HPS in the second HPS LCQ meeting a threshold number, autonomously paging-in, by the external memory, the plurality of HPS into the first HPS LCQ of the CM.

18. The method of claim 17, wherein the plurality of HPS are autonomously paged into the first HPS LCQ of the CM by DMA.

19. The method of claim 15, further comprising: in response to a completion of the LCP processing, enqueuing, by the LCP CX of the UL LCP uC, the second set of Layer 2 commands into a Layer 2 command queue of the CM.

20. The method of claim 19, further comprising: receiving, by the UL DP hardware, the first set of Layer 2 commands and the second set of Layer 2 commands; generating, by the UL DP hardware, a plurality of media access control sub-protocol data units (MAC SubPDUs) based on the first set of Layer 2 commands and the second set of Layer 2 commands; and sending, by the UL DP hardware, the MAC SubPDUs to a physical layer (PHY) transmitter (TX) for transmission to a base station.

Description:
APPARATUS AND METHOD FOR LOGICAL CHANNEL PRIORITIZATION (LCP) PROCESSING OF HIGH-DENSITY, HIGH-PRIORITY SMALL PACKETS

BACKGROUND

[0001] Embodiments of the present disclosure relate to apparatus and method for wireless communication.

[0002] Wireless communication systems are widely deployed to provide various telecommunication services such as telephony, video, data, messaging, and broadcasts. In cellular communication, such as the 4th-generation (4G) Long Term Evolution (LTE) and the 5th-generation (5G) New Radio (NR), the 3rd Generation Partnership Project (3GPP) defines various procedures for uplink (UL) Layer 2 data processing.

SUMMARY

[0003] According to one aspect of the present disclosure, a baseband chip is provided. The baseband chip may include custom memory (CM). The CM may be configured to receive, in a first high-density, high-priority small-packet (HPS) logical channel queue (LCQ), a plurality of HPS from an external memory. The CM may be configured to send a direct memory access (DMA) complete message to HPS CX once the plurality of HPS are received. The HPS CX may be located in a UL logical channel prioritization (LCP) microcontroller (uC) that includes a co-processor or a set of hardware threads to which the HPS CX are coupled. The baseband chip may include a Layer 2 data plane (DP) subsystem. The Layer 2 DP subsystem may include UL DP hardware. The Layer 2 DP subsystem may include the UL LCP uC. The UL LCP uC may include the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX. The HPS CX may be configured to, in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware. The LCP CX may be configured to receive a UL grant from a PHY receiver. The LCP CX may be configured to, in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.

[0004] According to another aspect of the present disclosure, an apparatus for wireless communication of a user equipment (UE) is provided. The UE may include a baseband chip. The baseband chip may include CM. The CM may be configured to receive, in a first HPS LCQ, a plurality of HPS from an external memory. The CM may be configured to send a DMA complete message to HPS CX once the plurality of HPS are received. The HPS CX may be located in a UL LCP uC that includes a co-processor or a set of hardware threads to which the HPS CX are coupled. The baseband chip may include a Layer 2 DP subsystem. The Layer 2 DP subsystem may include UL DP hardware. The Layer 2 DP subsystem may include the UL LCP uC. The UL LCP uC may include the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX. The HPS CX may be configured to, in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware. The LCP CX may be configured to receive a UL grant from a PHY receiver. The LCP CX may be configured to, in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.

[0005] According to yet another aspect of the present disclosure, a method of wireless communication of a baseband chip is provided. The method may include receiving, in a first HPS LCQ of CM coupled to a UL LCP uC, a plurality of HPS from an external memory. The method may include sending, by the CM, a DMA complete message to HPS CX of the UL LCP uC once the plurality of HPS are received. In response to receiving the DMA complete message, the method may include performing, by the HPS CX of the UL LCP uC, HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by UL data plane (DP) hardware. The method may include receiving, by LCP CX of the UL LCP uC, a UL grant from a PHY receiver. In response to receiving the UL grant from the PHY receiver, the method may include performing, by the LCP CX of the UL LCP uC, LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.

[0006] These illustrative embodiments are mentioned not to limit or define the present disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, further serve to explain the principles of the present disclosure and to enable a person skilled in the pertinent art to make and use the present disclosure.

[0008] FIG. 1A illustrates a first example timing diagram used to schedule a UL medium access control (MAC) protocol data unit (PDU).

[0009] FIG. 1B illustrates a second example timing diagram that may be used to schedule a MAC PDU.

[0010] FIG. 2 illustrates an exemplary wireless network, according to some embodiments of the present disclosure.

[0011] FIG. 3 illustrates a block diagram of an exemplary node, according to some embodiments of the present disclosure.

[0012] FIG. 4 illustrates a block diagram of an exemplary baseband chip, according to some embodiments of the present disclosure.

[0013] FIG. 5 illustrates a diagram depicting detailed operations 500 associated with multiple parallel-HPS pre-processing, according to some aspects of the present disclosure.

[0014] FIG. 6 illustrates a conceptual flow diagram of an exemplary data flow, according to some embodiments of the present disclosure.

[0015] FIG. 7 illustrates a diagram of an exemplary LCP processing timeline, according to some embodiments of the present disclosure.

[0016] FIGs. 8A and 8B are a flowchart of an exemplary method of wireless communication, according to some embodiments of the present disclosure.

[0017] Embodiments of the present disclosure will be described with reference to the accompanying drawings.

DETAILED DESCRIPTION

[0018] Although specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present disclosure. It will be apparent to a person skilled in the pertinent art that the present disclosure can also be employed in a variety of other applications.

[0019] It is noted that references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” “some embodiments,” “certain embodiments,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of a person skilled in the pertinent art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

[0020] In general, terminology may be understood at least in part from usage in context. For example, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.

[0021] Various aspects of wireless communication systems will now be described with reference to various apparatus and methods. These apparatus and methods will be described in the following detailed description and illustrated in the accompanying drawings by various blocks, modules, units, components, circuits, steps, operations, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on the overall system.

[0022] The techniques described herein may be used for various wireless communication networks, such as code division multiple access (CDMA) system, time division multiple access (TDMA) system, frequency division multiple access (FDMA) system, orthogonal frequency division multiple access (OFDMA) system, single-carrier frequency division multiple access (SC-FDMA) system, wireless local area network (WLAN) system, and other networks. The terms “network” and “system” are often used interchangeably. A CDMA network may implement a radio access technology (RAT), such as Universal Terrestrial Radio Access (UTRA), evolved UTRA (E-UTRA), CDMA 2000, etc. A TDMA network may implement a RAT, such as the Global System for Mobile Communications (GSM). An OFDMA network may implement a RAT, such as LTE or NR. A WLAN system may implement a RAT, such as Wi-Fi. The techniques described herein may be used for the wireless networks and RATs mentioned above, as well as other wireless networks and RATs.

[0023] An important consideration of wireless communication relates to data rates, especially with the increased use of media streaming services. Carrier aggregation (CA) is one technique used in wireless communication to increase the data rate per user, whereby multiple frequency blocks (also referred to herein as “component carriers”) are assigned to the same UE for concurrent transmission using different parts of the frequency spectrum. Each component carrier (CC) may be associated with a different cell, e.g., base station. The maximum possible data rate per user is increased as the number of component carriers (CCs) assigned to a UE increases. CA also increases the sum data rate of a cell due to enhanced resource utilization and spectral efficiency.

[0024] While communicating using CA, a UE may be connected with two or more media access control (MAC) entities, which are each connected to a base station with multiple carriers of different bandwidths, a different number of available resources, and different radio channel conditions. To schedule the transmission of uplink (UL) data packets using CA, the UE receives multiple UL grants concurrently from different base stations. Each UL grant may schedule a different logical channel packet transmission on its respective CC.

[0025] A base station may send a UL grant to the UE using the physical downlink control channel (PDCCH) of that CC. The UE may receive the UL grant at the beginning of a slot (in downlink control information (DCI)), which indicates the time at which the associated MAC PDU is scheduled for transmission. The UL grant may also indicate the number of resources (e.g., byte size) that have been allocated for the MAC PDU. The scheduled time may be equivalent to a time delay of K2 slot(s) from the slot in which the UL grant is received. FIG. 1A illustrates a first example timing diagram 100 in which K2 may be one or more slots away from the received UL grant, with a transmission start symbol at the slot boundary. FIG. 1B illustrates a second example timing diagram 110 in which K2 is less than one, meaning the transmission start symbol S is in the same slot in which the UL grant is received. A UL grant that schedules a MAC PDU transmission in the same slot in which the grant was received indicates that the MAC PDU may be associated with a low-latency application, and hence, the UE may need to process the MAC PDU within milliseconds or microseconds.
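The relationship between the grant slot, K2, and the start symbol S can be made concrete with a minimal sketch. The class and field names below are hypothetical and are not taken from the disclosure or from 3GPP; K2 is modeled as a whole number of slots, with zero standing in for the same-slot case.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UlGrant:
    """Hypothetical UL grant fields (illustrative names only)."""
    rx_slot: int       # slot in which the UL grant was received
    k2_slots: int      # slot offset K2 between grant and transmission
    start_symbol: int  # transmission start symbol S within the target slot
    size_bytes: int    # resources allocated for the MAC PDU


def scheduled_tx(grant: UlGrant) -> Tuple[int, int]:
    """Return the (slot, symbol) at which the MAC PDU is scheduled to start."""
    return grant.rx_slot + grant.k2_slots, grant.start_symbol


def is_same_slot(grant: UlGrant) -> bool:
    """True when the MAC PDU must go out in the slot that carried the grant."""
    return grant.k2_slots == 0
```

In this model, a grant for which `is_same_slot` returns true is the low-latency case, where little time remains for Layer 2 processing after the grant arrives.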

[0026] The scheduling mechanism described above may apply to each CC, and hence, the UE may process MAC PDUs for multiple CCs concurrently. One challenge in UL MAC grant scheduling is that the UE may have to service multiple UL MAC grants (also referred to herein as “UL grants”) from multiple cells in a multiple CA configuration in which the UE is connected to two or more MAC entities, each connected to a different base station with multiple carriers of different bandwidth, resources, and radio channel conditions. In such a scenario, the UE needs to service multiple logical channel (LC) data packets efficiently and optimally among the carriers, without any de-synchronization or loss of data.

[0027] Another challenge is that, to optimize the power consumed by the UE’s modem (also referred to herein as a “baseband chip”) in a multiple CA configuration, the UL Layer 2 DP needs to support different application types, including high-throughput, high-latency data transfers as well as low-latency, low-data-rate applications. When operating in low-data-rate applications, the power usage should be minimized as much as possible without sacrificing quality-of-service (QoS) latency performance or high-throughput data performance.

[0028] In known devices, these challenges lead to various problems, such as an inefficient usage of central processing unit (CPU) cores and DP hardware resources when processing UL Layer 2 and Layer 3 data packets, inflexible DP hardware Layer 2/Layer 3 and cipher paths, a waste of CPU cores and DP hardware during low data-rate applications, insufficient CPU and DP hardware resources during high packet-rate use cases, an inability to dynamically detect under-run or over-loaded DP data-path scenarios that violate scheduled UL transmission timing, an inability to adjust resource utilization based on UL timing requirements in dynamic mixed traffic packet use cases, and high power usage for single CC low data-rate applications, to name a few.

[0029] Thus, there exists an unmet need for a baseband chip that may perform a UL timing analysis based on dynamic and semi-static input parameters, and derive the minimum resources that achieve scheduled UL transmission success.

[0030] To overcome these and other challenges, the present disclosure provides a baseband chip with LCQs in a double data rate (DDR) memory external to a DP subsystem, which may autonomously page-in small packets to refill the small LCQs in a local custom memory (CM) area. This brings the packet descriptors close to the microcontroller (uC) customized instructions (CX) that process the packets, without the need for the uC to program a DMA controller to do so. This operation may be executed in the background autonomously, in parallel with the other packet processing functions, thus saving cycles. The threshold to trigger the page-in may be specified separately for control packets, normal data packets, and HPS. Whenever a burst of HPS is paged into the LCQ residing in the CM, the uC triggers multiple parallel-HPS pre-processing of the HPS. The multiple parallel-HPS pre-processing may include, e.g., robust header compression (ROHC) and preparing the service data adaptation protocol (SDAP), packet data convergence protocol (PDCP), radio link control (RLC), and media access control (MAC) headers. Since these are the highest priority packets in this logical channel (LC)/resource block (RB), the PDCP and RLC sequence numbers may be assigned immediately, and the headers can be constructed before reaching the uplink (UL) data plane (DP) hardware. This HPS pre-processing scheme (e.g., performed using background hardware threads or a co-processor located at the uC) reduces the number of cycles in LCP grant processing per packet. According to further aspects of the present disclosure, the LCP grant processing constructs the packet descriptor for each packet. The UL DP hardware may generate MAC sub-protocol data units (SubPDUs), which may be streamed through the cipher and integrity engine before they are sent to the physical layer (PHY) transmitter (TX) for transmission. For each LC, the LCP CX traverses the RLC and PDCP control queues and retransmission queues, and then retrieves the bulk pre-processed HPS before dequeuing the normal and low-priority packets. Using the present techniques described below in connection with FIGs. 2-8, the number of cycles per packet may be significantly reduced due to pre-processing packet descriptors for the bulk HPS, thereby boosting performance for high-data rate UL transmission, while saving power and reducing the overall silicon footprint of the DP subsystem.

[0031] Although the following processing techniques are described in connection with Layer 2 data processing, the same or similar techniques may be applied to Layer 3 and/or Layer 4 data processing to optimize power consumption at the PHY, Layer 3, and/or Layer 4 subsystems without departing from the scope of the present disclosure.
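By way of illustration only, the threshold-triggered page-in, the pre-grant HPS pre-processing, and the dequeue order described in paragraph [0030] can be modeled in software as follows. This is a simplified sketch, not the disclosed hardware: the class, method, and field names are hypothetical, header construction is reduced to assigning a sequence number, packets are not segmented, and a packet that does not fit the remaining grant is simply left in its queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class UplinkModel:
    """Toy model of one logical channel (all names are assumptions)."""
    hps_threshold: int                                   # page-in trigger for HPS
    ddr_hps_lcq: deque = field(default_factory=deque)    # HPS LCQ in external memory
    cm_hps_lcq: deque = field(default_factory=deque)     # HPS LCQ in custom memory
    preprocessed: deque = field(default_factory=deque)   # descriptors built before any grant
    next_sn: int = 0                                     # stand-in for PDCP/RLC numbering

    def enqueue_from_l3(self, payload: bytes) -> None:
        """Layer 3 hands over an HPS; page in once the threshold is met."""
        self.ddr_hps_lcq.append(payload)
        if len(self.ddr_hps_lcq) >= self.hps_threshold:
            self._page_in()

    def _page_in(self) -> None:
        """Stand-in for the autonomous DMA transfer into custom memory."""
        while self.ddr_hps_lcq:
            self.cm_hps_lcq.append(self.ddr_hps_lcq.popleft())
        self._on_dma_complete()

    def _on_dma_complete(self) -> None:
        """HPS pre-processing: build descriptors without waiting for a grant."""
        while self.cm_hps_lcq:
            payload = self.cm_hps_lcq.popleft()
            self.preprocessed.append(
                {"kind": "hps", "sn": self.next_sn, "len": len(payload)}
            )
            self.next_sn += 1

    def on_ul_grant(self, grant_bytes: int, control: deque,
                    retx: deque, normal: deque) -> list:
        """Fill the grant: control, retransmissions, pre-processed HPS, normal."""
        commands = []
        budget = grant_bytes

        def drain(queue, to_descriptor):
            nonlocal budget
            while queue and to_descriptor(queue[0])["len"] <= budget:
                desc = to_descriptor(queue.popleft())
                budget -= desc["len"]
                commands.append(desc)

        drain(control, lambda p: {"kind": "control", "len": len(p)})
        drain(retx, lambda p: {"kind": "retx", "len": len(p)})
        drain(self.preprocessed, lambda d: d)   # descriptors already exist
        drain(normal, lambda p: {"kind": "normal", "len": len(p)})
        return commands
```

The point of the sketch is that `on_ul_grant` only copies HPS descriptors that already exist, whereas descriptors for the other queues are built inside the grant-handling path; this mirrors the per-packet cycle saving the paragraph attributes to pre-processing.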

[0032] FIG. 2 illustrates an exemplary wireless network 200, in which some aspects of the present disclosure may be implemented, according to some embodiments of the present disclosure. As shown in FIG. 2, wireless network 200 may include a network of nodes, such as user equipment 202, an access node 204, and a core network element 206. User equipment 202 may be any terminal device, such as a mobile phone, a desktop computer, a laptop computer, a tablet, a vehicle computer, a gaming console, a printer, a positioning device, a wearable electronic device, a smart sensor, or any other device capable of receiving, processing, and transmitting information, such as any member of a vehicle-to-everything (V2X) network, a cluster network, a smart grid node, or an Internet-of-Things (IoT) node. It is understood that user equipment 202 is illustrated as a mobile phone simply by way of illustration and not by way of limitation.

[0033] Access node 204 may be a device that communicates with user equipment 202, such as a wireless access point, a base station (BS), a Node B, an enhanced Node B (eNodeB or eNB), a next-generation NodeB (gNodeB or gNB), a cluster master node, or the like. Access node 204 may have a wired connection to user equipment 202, a wireless connection to user equipment 202, or any combination thereof. Access node 204 may be connected to user equipment 202 by multiple connections, and user equipment 202 may be connected to other access nodes in addition to access node 204. Access node 204 may also be connected to other user equipments. When configured as a gNB, access node 204 may operate in millimeter wave (mmW) frequencies and/or near mmW frequencies in communication with the user equipment 202. When access node 204 operates in mmW or near mmW frequencies, the access node 204 may be referred to as an mmW base station. Extremely high frequency (EHF) is part of the radio frequency (RF) in the electromagnetic spectrum. EHF has a range of 30 GHz to 300 GHz and a wavelength between 1 millimeter and 10 millimeters. Radio waves in the band may be referred to as a millimeter wave. Near mmW may extend down to a frequency of 3 GHz with a wavelength of 100 millimeters. The super high frequency (SHF) band extends between 3 GHz and 30 GHz, also referred to as centimeter wave. Communications using the mmW or near mmW radio frequency band have extremely high path loss and a short range. The mmW base station may utilize beamforming with user equipment 202 to compensate for the extremely high path loss and short range. It is understood that access node 204 is illustrated by a radio tower by way of illustration and not by way of limitation.
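The band edges quoted above follow from the free-space relation between wavelength and frequency, wavelength = c / f. A small helper, with a hypothetical function name, lets the figures be checked for any frequency:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact value in vacuum


def wavelength_mm(frequency_hz: float) -> float:
    """Free-space wavelength in millimeters for a frequency given in hertz."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz * 1000.0


if __name__ == "__main__":
    # Print the wavelength at the band edges named in the paragraph above.
    for label, freq in (("3 GHz", 3e9), ("30 GHz", 30e9), ("300 GHz", 300e9)):
        print(f"{label}: {wavelength_mm(freq):.2f} mm")
```

Because wavelength is inversely proportional to frequency, each tenfold increase in frequency shortens the wavelength by a factor of ten.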

[0034] Access nodes 204, which are collectively referred to as E-UTRAN in the evolved packet core network (EPC) and as NG-RAN in the 5G core network (5GC), interface with the EPC and 5GC, respectively, through dedicated backhaul links (e.g., S1 interface). In addition to other functions, access node 204 may perform one or more of the following functions: transfer of user data, radio channel ciphering and deciphering, integrity protection, header compression, mobility control functions (e.g., handover, dual connectivity), inter-cell interference coordination, connection setup and release, load balancing, distribution for non-access stratum (NAS) messages, NAS node selection, synchronization, radio access network (RAN) sharing, multimedia broadcast multicast service (MBMS), subscriber and equipment trace, RAN information management (RIM), paging, positioning, and delivery of warning messages. Access nodes 204 may communicate directly or indirectly (e.g., through the 5GC) with each other over backhaul links (e.g., X2 interface). The backhaul links may be wired or wireless.

[0035] Core network element 206 may serve access node 204 and user equipment 202 to provide core network services. Examples of core network element 206 may include a home subscriber server (HSS), a mobility management entity (MME), a serving gateway (SGW), or a packet data network gateway (PGW). These are examples of core network elements of an evolved packet core (EPC) system, which is a core network for the LTE system. Other core network elements may be used in LTE and in other communication systems. In some embodiments, core network element 206 includes an access and mobility management function (AMF), a session management function (SMF), or a user plane function (UPF) of the 5GC for the NR system. The AMF may be in communication with a Unified Data Management (UDM). The AMF is the control node that processes the signaling between the user equipment 202 and the 5GC. Generally, the AMF provides QoS flow and session management. All user Internet protocol (IP) packets are transferred through the UPF. The UPF provides user equipment (UE) IP address allocation as well as other functions. The UPF is connected to the IP Services. The IP Services may include the Internet, an intranet, an IP Multimedia Subsystem (IMS), a PS Streaming Service, and/or other IP services. It is understood that core network element 206 is shown as a set of rack-mounted servers by way of illustration and not by way of limitation.

[0036] Core network element 206 may connect with a large network, such as the Internet 208, or another Internet Protocol (IP) network, to communicate packet data over any distance. In this way, data from user equipment 202 may be communicated to other user equipments connected to other access points, including, for example, a computer 210 connected to Internet 208, for example, using a wired connection or a wireless connection, or to a tablet 212 wirelessly connected to Internet 208 via a router 214. Thus, computer 210 and tablet 212 provide additional examples of possible user equipments, and router 214 provides an example of another possible access node.

[0037] A generic example of a rack-mounted server is provided as an illustration of core network element 206. However, there may be multiple elements in the core network including database servers, such as a database 216, and security and authentication servers, such as an authentication server 218. Database 216 may, for example, manage data related to user subscription to network services. A home location register (HLR) is an example of a standardized database of subscriber information for a cellular network. Likewise, authentication server 218 may handle authentication of users, sessions, and so on. In the NR system, an authentication server function (AUSF) device may be the entity to perform user equipment authentication. In some embodiments, a single server rack may handle multiple such functions, such that the connections between core network element 206, authentication server 218, and database 216 may be local connections within a single rack.

[0038] Each element in FIG. 2 may be considered a node of wireless network 200. More detail regarding the possible implementation of a node is provided by way of example in the description of a node 300 in FIG. 3. Node 300 may be configured as user equipment 202, access node 204, or core network element 206 in FIG. 2. Similarly, node 300 may also be configured as computer 210, router 214, tablet 212, database 216, or authentication server 218 in FIG. 2. As shown in FIG. 3, node 300 may include a processor 302, a memory 304, and a transceiver 306. These components are shown as connected to one another by a bus, but other connection types are also permitted. When node 300 is user equipment 202, additional components may also be included, such as a user interface (UI), sensors, and the like. Similarly, node 300 may be implemented as a blade in a server system when node 300 is configured as core network element 206. Other implementations are also possible.

[0039] Transceiver 306 may include any suitable device for sending and/or receiving data. Node 300 may include one or more transceivers, although only one transceiver 306 is shown for simplicity of illustration. An antenna 308 is shown as a possible communication mechanism for node 300. Multiple antennas and/or arrays of antennas may be utilized for receiving multiple spatially multiplexed data streams. Additionally, examples of node 300 may communicate using wired techniques rather than (or in addition to) wireless techniques. For example, access node 204 may communicate wirelessly to user equipment 202 and may communicate by a wired connection (for example, by optical or coaxial cable) to core network element 206. Other communication hardware, such as a network interface card (NIC), may be included as well.

[0040] As shown in FIG. 3, node 300 may include processor 302. Although only one processor is shown, it is understood that multiple processors can be included. Processor 302 may include microprocessors, microcontroller units (MCUs), digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described throughout the present disclosure. Processor 302 may be a hardware device having one or more processing cores. Processor 302 may execute software. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Software can include computer instructions written in an interpreted language, a compiled language, or machine code. Other techniques for instructing hardware are also permitted under the broad category of software.

[0041] As shown in FIG. 3, node 300 may also include memory 304. Although only one memory is shown, it is understood that multiple memories can be included. Memory 304 can broadly include both memory and storage. For example, memory 304 may include random-access memory (RAM), read-only memory (ROM), static RAM (SRAM), dynamic RAM (DRAM), ferroelectric RAM (FRAM), electrically erasable programmable ROM (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid-state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions that can be accessed and executed by processor 302. Broadly, memory 304 may be embodied by any computer-readable medium, such as a non-transitory computer-readable medium.

[0042] Processor 302, memory 304, and transceiver 306 may be implemented in various forms in node 300 for performing wireless communication functions. In some embodiments, at least two of processor 302, memory 304, and transceiver 306 are integrated into a single system-on-chip (SoC) or a single system-in-package (SiP). In some embodiments, processor 302, memory 304, and transceiver 306 of node 300 are implemented (e.g., integrated) on one or more SoCs. In one example, processor 302 and memory 304 may be integrated on an application processor (AP) SoC (sometimes known as a “host,” referred to herein as a “host chip”) that handles application processing in an operating system (OS) environment, including generating raw data to be transmitted. In another example, processor 302 and memory 304 may be integrated on a baseband processor (BP) SoC (sometimes known as a “modem,” referred to herein as a “baseband chip”) that converts the raw data, e.g., from the host chip, to signals that can be used to modulate the carrier frequency for transmission, and vice versa, which can run a real-time operating system (RTOS). In still another example, processor 302 and transceiver 306 (and memory 304 in some cases) may be integrated on an RF SoC (sometimes known as a “transceiver,” referred to herein as an “RF chip”) that transmits and receives RF signals with antenna 308. It is understood that in some examples, some or all of the host chip, baseband chip, and RF chip may be integrated as a single SoC. For example, a baseband chip and an RF chip may be integrated into a single SoC that manages all the radio functions for cellular communication.

[0043] Referring back to FIG. 2, in some embodiments, user equipment 202 may include a baseband chip that may perform an LCQ automatic page-in refill technique using an external memory to refill the local LCQs located in a CM of a DP subsystem without DMA overhead from the UL LCP uC. Moreover, the baseband chip may perform parallel LCP pre-processing of HPS to reduce processing latency by the UL DP hardware. Still further, user equipment 202 may perform an exemplary pipelined LCP grant processing for transmission of UL packets with a reduced number of cycles per packet. Additional details of the exemplary baseband chip and its associated exemplary techniques are provided below in connection with FIGs. 4-8.

[0044] FIG. 4 illustrates a detailed block diagram of an exemplary baseband chip 400 (referred to hereinafter as “baseband chip 400”), according to some embodiments of the present disclosure. As shown in FIG. 4, baseband chip 400 may include, e.g., a PHY subsystem with a PHY TX 402a and a PHY receiver (RX) 402b, a Layer 2 DP subsystem with UL DP hardware 404, a UL LCP uC 406, and a CM 414, external DDR memory 420 (referred to hereinafter as “DDR memory 420”), and a Layer 3 subsystem 422. UL LCP uC 406 may include a co-processor or hardware thread 408, HPS custom instructions (CX) 410, and an LCP CX 412. CM 414 may include a first set of LCQs 416a with an HPS LCQ 418a, among other LCQs for non-HPS packets (e.g., control, retransmission packets, and regular UL packets). DDR memory 420 may include a second set of LCQs 416b with an HPS LCQ 418b and other LCQs for non-HPS packets.

[0045] High-data-rate applications in the downlink (DL) direction may significantly increase the number of uplink (UL) acknowledgement (ACK) small-packet bursts that must be processed by baseband chip 400. These small-packet bursts may include transmission control protocol (TCP) ACK packets, by way of example and not limitation. Additionally and/or alternatively, other high-data-rate applications that may cause small-packet bursts include industrial internet-of-things (IIoT) applications and video streaming applications. As such, it is important to reduce the number of processing cycles performed by the LCP CX 412 and UL DP hardware 404 to prepare a MAC SubPDU packet for transmission in a MAC PDU by PHY TX 402a.

[0046] To accomplish this, baseband chip 400 may include an exemplary UL DP architecture, which includes the above-mentioned hardware, firmware, and memory, and which performs three main stages of operation to reduce the number of processing cycles, namely, stage A, stage B, and stage C, as described below.

[0047] Stage A may include LCQ automatic page-in from an LCQ residing in DDR memory to refill a local LCQ. Stage A may begin when Layer 3 subsystem 422 sends (at 401) a burst of small packets, e.g., such as HPS, into HPS LCQ 418b of DDR memory 420. DDR memory hardware (not shown) may determine whether a threshold number of HPS reside in HPS LCQ 418b. In response to determining that the number of HPS residing in HPS LCQ 418b meets the threshold number, DDR memory 420 may cause (at 403), e.g., by a direct memory access (DMA), an autonomous page-in of the HPS from HPS LCQ 418b into a corresponding HPS LCQ 418a in CM 414. As used herein, the phrase “autonomous page-in” may refer to the HPS being sent from DDR memory 420 to CM 414 without UL LCP uC 406 instructing a DMA controller to do so. Autonomous page-in reduces or eliminates the overhead of DMA controller programming by UL LCP uC 406, while at the same time minimizing the number of uC million instructions per second (MIPs) cycles. Once the autonomous page-in of the HPS is complete, the operations of Stage B may begin.
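By way of illustration and not limitation, the threshold-triggered page-in of Stage A may be modeled in software as follows. This is a minimal sketch only: the class name, the threshold value of four, and the callback standing in for the DMA complete message are assumptions made for illustration, and the disclosure describes this behavior as being carried out by hardware rather than by code running on UL LCP uC 406.

```python
from collections import deque
from typing import Callable, Deque, List


class DdrHpsQueue:
    """Toy model of an HPS LCQ in external memory with a page-in threshold."""

    def __init__(self, threshold: int, cm_queue: Deque[bytes],
                 on_dma_complete: Callable[[int], None]) -> None:
        self.threshold = threshold              # hypothetical threshold number
        self.pending: Deque[bytes] = deque()    # HPS waiting in external memory
        self.cm_queue = cm_queue                # local HPS LCQ in the CM
        self.on_dma_complete = on_dma_complete  # stands in for the DMA complete message

    def enqueue(self, packet: bytes) -> None:
        """Accept one HPS from Layer 3 and page in once the threshold is met."""
        self.pending.append(packet)
        if len(self.pending) >= self.threshold:
            self._page_in()

    def _page_in(self) -> None:
        """Move every pending HPS to the CM queue without involving the uC."""
        moved = len(self.pending)
        while self.pending:
            self.cm_queue.append(self.pending.popleft())
        self.on_dma_complete(moved)


cm_hps_lcq: Deque[bytes] = deque()
dma_events: List[int] = []
ddr = DdrHpsQueue(threshold=4, cm_queue=cm_hps_lcq, on_dma_complete=dma_events.append)
for i in range(6):
    ddr.enqueue(bytes([i]))
```

In this sketch, the first four packets are paged in as one batch and produce a single completion event, while the remaining two packets stay in the external-memory queue until the threshold is met again.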

[0048] Stage B may include multiple parallel-HPS pre-processing of HPS in a local CM. For example, at Stage B, CM 414 may send (at 405) a DMA complete message to HPS CX 410. When the DMA complete message is received, HPS CX may access the HPS in HPS LCQ 418a and perform (at 407) HPS pre-processing, which may include a parallel-processing paradigm scheme. For example, using the parallel-processing paradigm scheme, a large number of HPS may be prepared before LCP processing with ROHC compression, and with the headers formatted based on current running RLC sequence number (SN), and PDCP SN derived from current running TX Next (PDCP Count) values. In so doing, the real-time tight time constraint on per-packet LCP processing may be significantly reduced, thereby reducing power consumption and the number of processing cycles associated with HPS. Additional details of the operations associated with Stage B are provided below in connection with FIG. 5.
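By way of illustration and not limitation, the assignment of sequence numbers ahead of the grant may be sketched as follows. The sketch assumes 12-bit PDCP and RLC sequence number fields and derives the PDCP SN as the running TX Next count modulo the sequence number space; the data structure and function names are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SnState:
    tx_next: int            # running PDCP COUNT of the next packet
    rlc_sn: int             # running RLC sequence number
    pdcp_sn_bits: int = 12  # assumed field width
    rlc_sn_bits: int = 12   # assumed field width


def assign_sns(state: SnState, num_packets: int) -> List[Tuple[int, int]]:
    """Return (pdcp_sn, rlc_sn) for each packet and advance the running state."""
    assigned = []
    for _ in range(num_packets):
        pdcp_sn = state.tx_next % (1 << state.pdcp_sn_bits)
        assigned.append((pdcp_sn, state.rlc_sn))
        state.tx_next += 1
        state.rlc_sn = (state.rlc_sn + 1) % (1 << state.rlc_sn_bits)
    return assigned


state = SnState(tx_next=4094, rlc_sn=10)
sns = assign_sns(state, 4)
```

Because each HPS receives its sequence numbers as soon as it is pre-processed, the headers for an entire burst can be formatted before any grant arrives. The example starts near the 12-bit boundary to show the PDCP SN wrapping to zero while the underlying count keeps increasing.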

[0049] Stage C may include pipelined on-the-fly LCP-grant processing for transmission with a reduced number of cycles per packet. For example, at Stage C, when a PHYGrant is received (at 409) by UL LCP uC 406, LCP CX 412 may loop through all the LCQs to retrieve the packet descriptors from each LCQ and prepare (at 411) the Layer 2 command(s) (L2Cmd(s)) used by UL DP hardware 404 to perform Layer 2-packet processing. This includes dequeuing (at 413) the packet descriptor from each LCQ in the first set of LCQs 416a, performing ROHC compression, formatting the SDAP, PDCP, RLC, and MAC headers, and creating L2Cmd packet descriptors for use by UL DP hardware 404. Since the large number of HPS L2Cmd(s) are already prepared, they may be enqueued (at 415) into the L2CmdQ directly, saving significant MIPs cycles in the LCP foreground real-time processing, which reduces the per-packet processing time to a minimum. UL DP hardware 404 may read (at 417) the packet descriptor, stream the data bytes through the cipher and integrity engine, attach the headers, and send (at 419) the packet stream on-the-fly to PHY TX 402a for transmission over-the-air to a base station.
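By way of illustration and not limitation, the saving at grant time may be sketched as follows. The sketch serves the queues in list order against a simple byte budget, which is a simplification introduced here (it ignores header overhead and any prioritized-bit-rate rules); the Descriptor type, the build_l2cmd stand-in, and the queue order are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Descriptor:
    length: int                  # payload size in bytes
    l2cmd: Optional[str] = None  # already filled in for pre-processed HPS


def build_l2cmd(desc: Descriptor) -> str:
    """Stand-in for grant-time work: compression, headers, descriptor build."""
    return f"L2Cmd(len={desc.length})"


def process_grant(grant_bytes: int,
                  lcqs: List[Tuple[str, List[Descriptor]]]) -> Tuple[List[str], int]:
    """Serve LCQs in list order; return the L2CmdQ and the grant-time build count."""
    l2cmdq: List[str] = []
    builds = 0
    remaining = grant_bytes
    for _name, queue in lcqs:
        while queue and queue[0].length <= remaining:
            desc = queue.pop(0)
            if desc.l2cmd is None:     # non-HPS: per-packet work happens now
                desc.l2cmd = build_l2cmd(desc)
                builds += 1
            l2cmdq.append(desc.l2cmd)  # HPS: command is enqueued directly
            remaining -= desc.length
    return l2cmdq, builds


hps = [Descriptor(40, l2cmd="L2Cmd(len=40)") for _ in range(3)]
control = [Descriptor(20)]
regular = [Descriptor(500), Descriptor(1500)]
l2cmdq, builds = process_grant(
    1000, [("control", control), ("hps", hps), ("regular", regular)])
```

In this sketch, five commands are enqueued for the grant but only two of them are built at grant time; the three HPS commands were prepared earlier and are simply copied into the command queue.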

[0050] FIG. 5 illustrates a diagram depicting detailed operations 500 associated with multiple parallel-HPS pre-processing, according to some aspects of the present disclosure. The detailed operations 500 depicted in FIG. 5 may be associated with Stage B described above in connection with FIG. 4.

[0051] Referring to FIG. 5, whenever a burst of high-priority small packets is autonomously paged into HPS LCQ 418a, HPS CX 410 may trigger multiple parallel-HPS pre-processing. The multiple parallel-HPS pre-processing may include, e.g., performing (at 501) ROHC compression and preparing (at 503) the SDAP, PDCP, RLC, and MAC headers. Since HPS LCQ 418a holds the high-priority packets, the PDCP SN and RLC SN are assigned immediately, and the headers can be constructed a priori. Lastly, the L2Cmd packet descriptor (L2Cmd PktDesc) may be constructed for the HPS and updated into LCQ 418a directly. New HPS are written to the bottom of LCQ 418a, and a write pointer (WritePtr) indicates the last HPS to be written to LCQ 418a. An HPS pointer (HPSPtr) is used for tracking the last L2Cmd PktDesc in LCQ 418a that has been successfully pre-processed. This pointer chases the LCQ WritePtr to ensure that all HPS in LCQ 418a have been pre-processed. The multiple parallel-HPS pre-processing via co-processor/hardware thread(s) 408 reduces the number of cycles spent per packet in the later LCP grant processing.
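By way of illustration and not limitation, the relationship between the HPSPtr and the LCQ WritePtr may be sketched as follows. For simplicity, the sketch pre-processes entries one at a time rather than in parallel, and both pointers are modeled as "one past the last entry" indices; the class, field names, and descriptor contents are hypothetical.

```python
from typing import List, Optional


class HpsLcq:
    """Toy HPS LCQ with a write pointer and a pre-processing pointer."""

    def __init__(self) -> None:
        self.packets: List[bytes] = []
        self.pkt_desc: List[Optional[dict]] = []
        self.write_ptr = 0  # one past the last HPS written
        self.hps_ptr = 0    # one past the last HPS pre-processed

    def write(self, packet: bytes) -> None:
        """Append a newly paged-in HPS at the bottom of the queue."""
        self.packets.append(packet)
        self.pkt_desc.append(None)
        self.write_ptr += 1

    def preprocess_pending(self) -> int:
        """Advance hps_ptr until it catches up with write_ptr."""
        done = 0
        while self.hps_ptr < self.write_ptr:
            pkt = self.packets[self.hps_ptr]
            # Stand-in for compression, header preparation, and descriptor build.
            self.pkt_desc[self.hps_ptr] = {"index": self.hps_ptr, "length": len(pkt)}
            self.hps_ptr += 1
            done += 1
        return done


lcq = HpsLcq()
for size in (40, 52, 40):
    lcq.write(bytes(size))
first_pass = lcq.preprocess_pending()
lcq.write(bytes(60))
second_pass = lcq.preprocess_pending()
```

Whenever the two pointers are equal, every HPS in the queue has a descriptor ready, so the later grant processing never has to build one for an HPS.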

[0052] FIG. 6 illustrates a conceptual flow diagram of an exemplary data flow 600 used to perform Stage A, Stage B, and Stage C described above in connection with FIG. 4, according to some embodiments of the present disclosure. FIG. 7 illustrates a diagram illustrating an exemplary LCP processing timeline 700 associated with the operations of Stage A, Stage B, and Stage C illustrated in the exemplary data flow 600 of FIG. 6, according to some embodiments of the present disclosure.

[0053] Referring to FIG. 6, the operations may be performed by, e.g., Layer 3 subsystem 422, LCQ(s) 416b of DDR memory 420, CM 414, HPS CX 410, LCP CX 412, UL DP hardware 404, PHY TX 402a, PHY RX 402b, and a base station 602.

[0054] To begin, Layer 3 subsystem 422 may send (at 601) an HPS burst to LCQ(s) 416b. DDR memory hardware and/or LCQ hardware coupled to HPS LCQ 418b may determine (at 603) whether the number of HPS in HPS LCQ 418b meets a threshold number. In response to determining that the threshold number of HPS is met, HPS LCQ 418b may cause (at 605) the HPS to be autonomously paged into CM 414 via DMA. More specifically, the HPS may be paged into HPS LCQ 418a of CM 414. These operations may be associated with Stage A.

[0055] Once the autonomous page-in is complete, the operations associated with Stage B may be performed. For example, CM 414 may send (at 607) a DMA done/complete message to HPS CX 410, which may initiate multiple parallel-HPS pre-processing of the HPS in HPS LCQ 418a. The multiple parallel-HPS pre-processing may include, e.g., ROHC compression (at 609), the composition (at 611) of Layer 2 SDAP, PDCP, RLC, and MAC headers, and the construction (at 613) of L2Cmd PktDesc.

[0056] Referring to Stage C, base station 602 may send (at 615), via the physical downlink control channel (PDCCH), downlink control information (DCI) that carries a PHY grant indication (GrantInd) to PHY RX 402b. PHY RX 402b may send (at 617) the PHY GrantInd to LCP CX 412. LCP CX 412 may read (at 619) the packet descriptors from the LC control and retransmission queues. LCP CX 412 may prepare (at 621) the L2Cmd(s) for control and/or retransmission packets based on the packet descriptors read from the LC control and retransmission queues. Since the L2Cmd(s) for HPS are pre-processed by HPS CX 410, these L2Cmd(s) are read (at 623) into LCP CX 412, which forwards them to UL DP hardware 404. The L2Cmd(s) for non-HPS are also sent to UL DP hardware 404 to generate MAC SubPDUs for transmission by PHY TX 402a to base station 602.

[0057] FIGs. 8A and 8B are a flowchart of an exemplary method 800 of wireless communication, according to some embodiments of the present disclosure. Method 800 may be performed by an apparatus for wireless communication, e.g., such as a user equipment, a baseband chip, a UL DP subsystem, a UL LCP uC, a CM, an HPS LCQ, HPS CX, a co-processor and/or hardware thread(s), LCP CX, LCP queue, UL DP hardware, PHY RX, or PHY TX, just to name a few. Method 800 may include steps 802-822 as described below. It is to be appreciated that some of the steps may be optional, and some of the steps may be performed simultaneously, or in a different order than shown in FIGs. 8A and 8B.

[0058] Referring to FIG. 8A, at 802, the apparatus may receive, in an HPS LCQ of an external memory, a plurality of HPS from a Layer 3 subsystem. For example, referring to FIG. 4, stage A may begin when Layer 3 subsystem 422 sends (at 401) a burst of small packets, e.g., such as HPS, into HPS LCQ 418b of DDR memory 420.

[0059] At 804, the apparatus may autonomously page-in, by the external memory, the plurality of HPS into an HPS LCQ of the CM in response to a number of HPS in the HPS LCQ of the external memory meeting a threshold number. For example, referring to FIG. 4, DDR memory hardware (not shown) may determine whether a threshold number of HPS reside in HPS LCQ 418b. In response to determining that the number of HPS residing in HPS LCQ 418b meets the threshold number, DDR memory 420 may cause (at 403), e.g., by a direct memory access (DMA), an autonomous page-in of the HPS from HPS LCQ 418b into a corresponding HPS LCQ 418a in CM 414. As used herein, the phrase “autonomous page-in” may refer to the HPS being sent from DDR memory 420 to CM 414 without UL LCP uC 406 instructing a DMA controller to do so. Autonomous page-in reduces or eliminates the overhead of DMA controller programming by UL LCP uC 406, while at the same time minimizing the number of uC MIPs cycles. Once the autonomous page-in of the HPS is complete, the operations of Stage B may begin.

[0060] At 806, the apparatus may receive, in an HPS LCQ of CM coupled to a UL LCP uC, the plurality of HPS from an external memory. For example, referring to FIG. 4, HPS LCQ 418a may receive the burst of HPS autonomously paged-in from DDR memory 420.

[0061] At 808, the apparatus may send, by the CM, a DMA complete message to HPS CX of the UL LCP uC once the plurality of HPS are received. For example, referring to FIG. 4, at Stage B, CM 414 may send (at 405) a DMA complete message to HPS CX 410.

[0062] At 810, the apparatus may perform, by the HPS CX of the UL LCP uC, HPS pre-processing of the plurality of HPS located in the HPS LCQ of the CM to generate a first set of Layer 2 commands for use by UL DP hardware in response to receiving the DMA complete message. For example, referring to FIG. 4, when the DMA complete message is received, HPS CX 410 may access the HPS in HPS LCQ 418a and perform (at 407) HPS pre-processing, which may include a parallel-processing paradigm scheme. For example, using the parallel-processing paradigm scheme, a large number of HPS may be prepared before LCP processing with ROHC compression, and with the headers formatted based on the current running RLC sequence number (SN), and PDCP SN derived from the current running TX Next (PDCP Count) values. In so doing, the real-time tight time constraint on per-packet LCP processing may be significantly reduced, thereby reducing power consumption and the number of processing cycles associated with HPS. Additional details of the operations associated with Stage B are provided above in connection with FIG. 5.

[0063] At 812, the apparatus may receive, by LCP CX of the UL LCP uC, a UL grant from a PHY receiver. For example, referring to FIG. 4, at Stage C, a PHYGrant may be received (at 409) by UL LCP uC 406 from PHY RX 402b.

[0064] At 814, the apparatus may perform, by the LCP CX of the UL LCP uC, LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware in response to receiving the UL grant from the PHY receiver. For example, referring to FIG. 4, when the PHYGrant is received, LCP CX 412 may loop through all the LCQs to retrieve the packet descriptors from each non-HPS LCQ and prepare (at 411) the Layer 2 command(s) (L2Cmd(s)) used by UL DP hardware 404 to perform Layer 2-packet processing. This includes dequeuing (at 413) the packet descriptor from each non-HPS LCQ in the first set of LCQs 416a, performing ROHC compression, formatting the SDAP, PDCP, RLC, and MAC headers, and creating L2Cmd packet descriptors for use by UL DP hardware 404.

[0065] At 816, the apparatus may enqueue, by the LCP CX of the UL LCP uC, the first set of Layer 2 commands from the first HPS LCQ into a Layer 2 command queue of the CM in response to a completion of the LCP processing. For example, referring to FIG. 4, since the large number of HPS L2Cmd(s) are already prepared, they may be enqueued (at 415) into the L2CmdQ directly, saving significant MIPs cycles in the LCP foreground real-time processing, which reduces the per-packet processing time to a minimum.

[0066] Referring to FIG. 8B, at 818, the apparatus may receive, by the UL DP hardware, the first set of Layer 2 commands and the second set of Layer 2 commands. For example, referring to FIG. 4, UL DP hardware 404 may read (at 417) the packet descriptor, stream the data bytes through the cipher and integrity engine, and attach the headers.

[0067] At 820, the apparatus may generate a plurality of MAC SubPDUs based on the first set of Layer 2 commands and the second set of Layer 2 commands. For example, referring to FIG. 4, UL DP hardware 404 may generate a MAC PDU for transmission by generating a plurality of MAC SubPDUs based on the first set of Layer 2 commands associated with HPS and the second set of Layer 2 commands associated with non-HPS.
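By way of illustration and not limitation, the assembly of MAC SubPDUs into a MAC PDU may be sketched as follows. The disclosure does not specify a byte layout; the sketch assumes the R/F/LCID/L subheader format of the 3GPP NR MAC specification (a 6-bit LCID, with an 8-bit length field when F is 0 and a 16-bit length field when F is 1), and the function names and LCID values are hypothetical.

```python
from typing import Iterable, Tuple


def mac_subpdu(lcid: int, sdu: bytes) -> bytes:
    """Prepend an assumed R/F/LCID/L subheader to one MAC SDU."""
    if not 0 <= lcid < 64:
        raise ValueError("LCID must fit in 6 bits")
    length = len(sdu)
    if length < 256:
        # R=0, F=0: one-byte length field.
        header = bytes([lcid, length])
    elif length < 65536:
        # R=0, F=1 (bit 6 set): two-byte big-endian length field.
        header = bytes([0x40 | lcid]) + length.to_bytes(2, "big")
    else:
        raise ValueError("SDU too long for a 16-bit length field")
    return header + sdu


def mac_pdu(sdus: Iterable[Tuple[int, bytes]]) -> bytes:
    """Concatenate the SubPDUs for a sequence of (lcid, sdu) pairs."""
    return b"".join(mac_subpdu(lcid, sdu) for lcid, sdu in sdus)


small = mac_subpdu(4, b"\xaa" * 3)
large = mac_subpdu(4, bytes(300))
pdu = mac_pdu([(4, b"\xaa" * 3), (5, b"\xbb")])
```

A small packet such as a TCP ACK carries only a two-byte subheader under this layout, which is part of why the per-packet control work, rather than the payload itself, tends to dominate the cost of a small-packet burst.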

[0068] At 822, the apparatus may transmit, by the PHY TX, the MAC PDU to a base station. For example, referring to FIG. 4, PHY TX 402a may receive a packet stream on-the-fly from UL DP hardware 404. PHY TX 402a may transmit the packet stream over-the-air to a base station.

[0069] In various aspects of the present disclosure, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or encoded as instructions or code on a non-transitory computer-readable medium. Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a computing device, such as node 300 in FIG. 3. By way of example, and not limitation, such computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, HDD, such as magnetic disk storage or other magnetic storage devices, Flash drive, SSD, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a processing system, such as a mobile device or a computer. Disk and disc, as used herein, includes CD, laser disc, optical disc, digital video disc (DVD), and floppy disk where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

[0070] According to one aspect of the present disclosure, a baseband chip is provided. The baseband chip may include CM. The CM may be configured to receive, in a first HPS LCQ, a plurality of HPS from an external memory. The CM may be configured to send a DMA complete message to HPS CX once the plurality of HPS are received. The HPS CX may be located in a UL LCP uC that includes a co-processor or a set of hardware threads to which the HPS CX are coupled. The baseband chip may include a Layer 2 DP subsystem. The Layer 2 DP subsystem may include UL DP hardware. The Layer 2 DP subsystem may include the UL LCP uC. The UL LCP uC may include the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX. The HPS CX may be configured to, in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware. The LCP CX may be configured to receive a UL grant from a PHY receiver. The LCP CX may be configured to, in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.

[0071] In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to perform ROHC for each HPS of the plurality of HPS. In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to generate SDAP, PDCP, RLC, and MAC headers for each HPS of the plurality of HPS. In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to construct Layer 2 command packet descriptors for each HPS of the plurality of HPS.

[0072] In some embodiments, the baseband chip may include an external memory. In some embodiments, the external memory may be configured to receive, in a second HPS LCQ, the plurality of HPS from a Layer 3 subsystem. In some embodiments, in response to a number of HPS in the second HPS LCQ meeting a threshold number, the external memory may be configured to autonomously page-in the plurality of HPS into the first HPS LCQ of the CM.

[0073] In some embodiments, the plurality of HPS may be autonomously paged into the first HPS LCQ of the CM by DMA.

[0074] In some embodiments, the LCP CX may be further configured to perform LCP processing of one or more of control packets, retransmission packets, or normal-priority packets residing in different LCQs of the CM.

[0075] In some embodiments, in response to a completion of the LCP processing, the LCP CX may be further configured to enqueue the second set of Layer 2 commands into the Layer 2 command queue of the CM.

[0076] In some embodiments, the UL DP hardware may be configured to receive the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the UL DP hardware may be configured to generate a plurality of MAC SubPDUs based on the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the UL DP hardware may be configured to send the MAC SubPDUs to a PHY TX for transmission to a base station.

[0077] According to another aspect of the present disclosure, an apparatus for wireless communication of a UE is provided. The UE may include a baseband chip. The baseband chip may include CM. The CM may be configured to receive, in a first HPS LCQ, a plurality of HPS from an external memory. The CM may be configured to send a DMA complete message to HPS CX once the plurality of HPS are received. The HPS CX may be located in a UL LCP uC that includes a co-processor or a set of hardware threads to which the HPS CX are coupled. The baseband chip may include a Layer 2 DP subsystem. The Layer 2 DP subsystem may include UL DP hardware. The Layer 2 DP subsystem may include the UL LCP uC. The UL LCP uC may include the co-processor or a set of hardware threads, the HPS CX coupled to the co-processor or the set of hardware threads, and an LCP CX. The HPS CX may be configured to, in response to receiving the DMA complete message, perform HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by the UL DP hardware. The LCP CX may be configured to receive a UL grant from a PHY receiver. The LCP CX may be configured to, in response to receiving the UL grant from the PHY receiver, perform LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.

[0078] In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to perform ROHC for each HPS of the plurality of HPS. In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to generate SDAP, PDCP, RLC, and MAC headers for each HPS of the plurality of HPS. In some embodiments, to perform the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM, the HPS CX may be configured to construct Layer 2 command packet descriptors for each HPS of the plurality of HPS.

[0079] In some embodiments, the baseband chip may include an external memory. In some embodiments, the external memory may be configured to receive, in a second HPS LCQ, the plurality of HPS from a Layer 3 subsystem. In some embodiments, in response to a number of HPS in the second HPS LCQ meeting a threshold number, the external memory may be configured to autonomously page-in the plurality of HPS into the first HPS LCQ of the CM.

[0080] In some embodiments, the plurality of HPS may be autonomously paged into the first HPS LCQ of the CM by DMA.

[0081] In some embodiments, the LCP CX may be further configured to perform LCP processing of one or more of control packets, retransmission packets, or normal-priority packets residing in different LCQs of the CM.

[0082] In some embodiments, in response to a completion of the LCP processing, the LCP CX may be further configured to enqueue the second set of Layer 2 commands into the Layer 2 command queue of the CM.

[0083] In some embodiments, the UL DP hardware may be configured to receive the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the UL DP hardware may be configured to generate a plurality of MAC SubPDUs based on the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the UL DP hardware may be configured to send the MAC SubPDUs to a PHY TX for transmission to a base station.

[0084] According to yet another aspect of the present disclosure, a method of wireless communication of a baseband chip is provided. The method may include receiving, in a first HPS LCQ of CM coupled to a UL LCP uC, a plurality of HPS from an external memory. The method may include sending, by the CM, a DMA complete message to HPS CX of the UL LCP uC once the plurality of HPS are received. In response to receiving the DMA complete message, the method may include performing, by the HPS CX of the UL LCP uC, HPS pre-processing of the plurality of HPS located in the first HPS LCQ to generate a first set of Layer 2 commands for use by UL DP hardware. The method may include receiving, by LCP CX of the UL LCP uC, a UL grant from a PHY receiver. In response to receiving the UL grant from the PHY receiver, the method may include performing, by the LCP CX of the UL LCP uC, LCP processing of non-HPS packets to generate a second set of Layer 2 commands for use by the UL DP hardware. The first set of Layer 2 commands may be generated prior to the UL grant being received by the LCP CX.
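
The ordering of the method steps may be easier to follow with a small event-driven sketch. It is a toy model under stated assumptions: the class `UlLcpUc` and its handler names are hypothetical, both command sets are assumed to share one Layer 2 command queue, and the commands are simple tuples rather than real packet descriptors.

```python
class UlLcpUc:
    """Toy model of a UL LCP uC with separate HPS CX and LCP CX handlers."""

    def __init__(self):
        self.l2_command_queue = []   # assumed single queue read by UL DP hardware
        self.grant_received = False

    def on_dma_complete(self, hps_packets):
        """HPS CX path: triggered by the DMA complete message, no grant needed."""
        for pkt in hps_packets:
            # Record whether a grant had arrived when the command was generated.
            self.l2_command_queue.append(("HPS", pkt, self.grant_received))

    def on_ul_grant(self, non_hps_packets):
        """LCP CX path: triggered only by the UL grant from the PHY receiver."""
        self.grant_received = True
        for pkt in non_hps_packets:
            self.l2_command_queue.append(("LCP", pkt, self.grant_received))


uc = UlLcpUc()
uc.on_dma_complete([b"voice-1", b"voice-2"])   # first set of Layer 2 commands
uc.on_ul_grant([b"bulk-1"])                    # second set of Layer 2 commands
kinds = [cmd[0] for cmd in uc.l2_command_queue]
```

In this model, every HPS command is created while `grant_received` is still false, which mirrors the statement that the first set of Layer 2 commands may be generated before the UL grant is received.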

[0085] In some embodiments, the performing, by the HPS CX, the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM may include performing robust header compression (ROHC) for each HPS of the plurality of HPS. In some embodiments, the performing, by the HPS CX, the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM may include generating SDAP, PDCP, RLC, and MAC headers for each HPS of the plurality of HPS. In some embodiments, the performing, by the HPS CX, the HPS pre-processing of the plurality of HPS located in the first LCQ of the CM may include constructing Layer 2 command packet descriptors for each HPS of the plurality of HPS.

[0086] In some embodiments, the method may include receiving, in a second HPS LCQ of an external memory, the plurality of HPS from a Layer 3 subsystem. In some embodiments, in response to a number of HPS in the second HPS LCQ meeting a threshold number, the method may include autonomously paging-in, by the external memory, the plurality of HPS into the first HPS LCQ of the CM.

[0087] In some embodiments, the plurality of HPS may be autonomously paged into the first HPS LCQ of the CM by DMA.

[0088] In some embodiments, in response to a completion of the LCP processing, the method may include enqueuing, by the LCP CX of the UL LCP uC, the second set of Layer 2 commands into the Layer 2 command queue of the CM.

[0089] In some embodiments, the method may include receiving, by the UL DP hardware, the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the method may include generating, by the UL DP hardware, a plurality of MAC SubPDUs based on the first set of Layer 2 commands and the second set of Layer 2 commands. In some embodiments, the method may include sending, by the UL DP hardware, the MAC SubPDUs to a PHY TX for transmission to a base station.

[0090] The foregoing description of the specific embodiments will so reveal the general nature of the present disclosure that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present disclosure. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.

[0091] Embodiments of the present disclosure have been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.

[0092] The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present disclosure as contemplated by the inventor(s), and thus, are not intended to limit the present disclosure and the appended claims in any way.

[0093] Various functional blocks, modules, and steps are disclosed above. The particular arrangements provided are illustrative and without limitation. Accordingly, the functional blocks, modules, and steps may be re-ordered or combined in different ways than in the examples provided above. Likewise, certain embodiments include only a subset of the functional blocks, modules, and steps, and any such subset is permitted.

[0094] The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.