# Limit (Top N) Node

> Restrict the number of rows returned from a DataFrame, supporting simple row limiting and advanced grouping with optional sorting.

## What It Does

* Restricts rows globally or within groups to reduce dataset size.
* Optionally sorts rows by a specified column before the limit is applied.
* Preprocesses data selectively: only grouping keys are converted, while the sort column and all other columns are left unchanged.
* Supports multiple grouping keys to limit rows per category.
* Falls back to a global limit when `limit_across_groups` is true but no grouping keys are provided.

***

## 🏁 Getting Started

<Frame>
  <img src="https://mintcdn.com/nurturev/mHCimB8YAP--VMGj/images/Limit%20Top%20N%20Node%20config%20screenshot.png?fit=max&auto=format&n=mHCimB8YAP--VMGj&q=85&s=d78a0c130f8b29b3c69c1e99781756bf" alt="Limit Top N Node config screenshot" style={{ borderRadius: '0.5rem', width: '100%', margin: '1.5rem 0' }} width="1180" height="1664" data-path="images/Limit Top N Node config screenshot.png" />
</Frame>

<Steps>
  <Step title="Add the Limit (Top N) Node">Drag and drop the Limit (Top N) Node into your workflow.</Step>
  <Step title="Define Limit Settings">Specify the number of rows to return, sorting options, and grouping keys if required.</Step>
  <Step title="Run the Workflow">Execute the workflow to limit the rows in the output DataFrame.</Step>
  <Step title="Monitor the Output">The output DataFrame will contain the same columns as the input with a limited number of rows based on your settings.</Step>
</Steps>

***

## Inputs

| Input Name        | Type            | Required                            | Description                                                                                      |
| ----------------- | --------------- | ----------------------------------- | ------------------------------------------------------------------------------------------------ |
| `input_df_s3_url` | `Optional[str]` | Yes, if template variables are used | S3 URL to the input DataFrame (CSV/Parquet). Required when using template variables in settings. |

***

## Outputs

The node returns a **List\[Dict\[str, Any]]** where each dictionary contains:

| Output Name         | Type   | Description                                                                          |
| ------------------- | ------ | ------------------------------------------------------------------------------------ |
| `s3_output_url`     | `str`  | S3 URL of the output DataFrame (Parquet format)                                      |
| `s3_output_url_csv` | `str`  | S3 URL of the output DataFrame (CSV format)                                          |
| `file_info`         | `Dict` | Contains metadata: `rows_count` (int), `columns_count` (int), `columns` (List\[str]) |
| `handle_condition`  | `str`  | Always `"_default"` for this node (no conditional outputs)                           |
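
As a rough illustration of the shape described in the table above, the sketch below builds one result entry and reads its metadata. The bucket name, paths, column names, and counts are made up for the example, and `summarize` is a hypothetical helper, not part of the node.

```python
from typing import Any, Dict, List

# Hypothetical result list; every value here is invented for illustration.
result: List[Dict[str, Any]] = [
    {
        "s3_output_url": "s3://example-bucket/limit/output.parquet",
        "s3_output_url_csv": "s3://example-bucket/limit/output.csv",
        "file_info": {
            "rows_count": 10,
            "columns_count": 3,
            "columns": ["region", "customer", "score"],
        },
        "handle_condition": "_default",
    }
]


def summarize(entry: Dict[str, Any]) -> str:
    """Build a one-line summary from a single result entry."""
    info = entry["file_info"]
    return (
        f"{info['rows_count']} rows x {info['columns_count']} columns "
        f"-> {entry['s3_output_url']}"
    )


summary = summarize(result[0])
print(summary)
```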

### Output DataFrame Structure

The output DataFrame will contain the same columns as the input DataFrame, with the following characteristics:

* **All input columns preserved**: No columns are added or removed.
* **Row count limited**: The number of rows is reduced based on limit settings.
* **Selective preprocessing**: Only grouping keys are preprocessed.
* **Grouping keys**: Converted to string format with nulls replaced by '(Empty)'.
* **Column to sort**: No preprocessing - uses pandas default null handling.
* **Other columns**: Preserved in original format.
* **Sorting applied**: If specified, rows are sorted by the designated column using pandas default null handling.
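
The grouping-key preprocessing described above can be approximated in pandas as follows. This is a minimal sketch, not the node's actual code: `preprocess_grouping_keys` is a hypothetical helper, and the exact string formatting the node produces (for example, how numbers are rendered) may differ.

```python
import pandas as pd

EMPTY_LABEL = "(Empty)"  # placeholder for nulls, as named in this page


def preprocess_grouping_keys(df: pd.DataFrame, grouping_keys: list) -> pd.DataFrame:
    """Return a copy where only the grouping keys are stringified (hypothetical helper)."""
    out = df.copy()
    for key in grouping_keys:
        col = out[key].astype(object)
        # Replace nulls first; casting NaN straight to str would yield "nan".
        out[key] = col.where(col.notna(), EMPTY_LABEL).astype(str)
    return out


df = pd.DataFrame(
    {
        "region": ["east", None, "west"],
        "tier": [1.0, 2.0, None],
        "score": [9.5, None, 7.0],
    }
)
clean = preprocess_grouping_keys(df, ["region", "tier"])
print(clean)
```

Only `region` and `tier` change here; `score` keeps its original dtype and its null.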

***

## How It Works

1. **Data Loading**: Loads input data from S3 using the data loading helper.
2. **Field Validation**: Ensures all referenced columns exist in the input data.
3. **Data Preprocessing**:
   * **Grouping keys**: Converts to string format and replaces nulls with '(Empty)'.
   * **Column to sort**: No preprocessing applied - uses pandas default null handling.
   * **Other columns**: Left unchanged.
4. **Sorting**: If `column_to_sort` is specified, sorts the data using pandas default null handling. This happens before the limit is applied, so the limit keeps the first rows of the sorted data.
5. **Limit Logic Application**:
   * **Without Grouping**: Applies the limit to the entire dataset.
   * **With Grouping**: Groups data by the specified keys, applies the limit to each group, then combines the results.
   * **Graceful Fallback**: If `limit_across_groups` is True but no grouping keys are provided, behaves as if no grouping were requested.
6. **Test Mode**: If enabled, limits output to 5 rows regardless of the limit setting.
7. **Output Generation**: Saves results to S3 in both Parquet and CSV formats.
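
The steps above can be sketched in pandas roughly as follows. This is an approximation under stated assumptions, not the node's implementation: `limit_rows`, its `limit` parameter, and the boolean `ascending` (standing in for `sorting_order`) are hypothetical names, S3 loading and saving are omitted, and test mode is assumed to cap the already-limited result at 5 rows.

```python
import pandas as pd

EMPTY_LABEL = "(Empty)"


def limit_rows(df, limit, grouping_keys=None, limit_across_groups=False,
               column_to_sort=None, ascending=True, test_mode=False):
    """Hypothetical sketch: validate, preprocess, sort, limit, then cap for test mode."""
    keys = list(grouping_keys or [])

    # Field validation: every referenced column must exist.
    referenced = keys + ([column_to_sort] if column_to_sort else [])
    missing = [c for c in referenced if c not in df.columns]
    if missing:
        raise ValueError(f"Columns not found in input: {missing}")

    out = df.copy()

    # Preprocess grouping keys only: nulls become "(Empty)", values become strings.
    for key in keys:
        col = out[key].astype(object)
        out[key] = col.where(col.notna(), EMPTY_LABEL).astype(str)

    # Sort before limiting; pandas places nulls last by default.
    if column_to_sort:
        out = out.sort_values(column_to_sort, ascending=ascending, kind="stable")

    # Per-group limit only when requested AND keys exist; otherwise a global limit.
    if limit_across_groups and keys:
        out = out.groupby(keys, sort=False).head(limit)
    else:
        out = out.head(limit)

    # Assumed test-mode behavior: keep at most 5 rows.
    if test_mode:
        out = out.head(5)
    return out.reset_index(drop=True)


sales = pd.DataFrame(
    {"region": ["east", "east", "west", None, "west"], "score": [5, 9, 7, 3, 8]}
)
top_per_region = limit_rows(
    sales, limit=1, grouping_keys=["region"], limit_across_groups=True,
    column_to_sort="score", ascending=False,
)
fallback = limit_rows(sales, limit=2, limit_across_groups=True)
print(top_per_region)
print(fallback)
```

In this sketch, `GroupBy.head` keeps rows in their sorted order rather than clustering them by group; the node may combine group results differently.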

***

## 🚀 Example Use Cases & Prompts

| Use Case                     | Setup or Prompt Example                                       |
| ---------------------------- | ------------------------------------------------------------- |
| **Sampling Large Datasets**  | Limit to a small number of rows for a preview                 |
| **Top N by Metric**          | Limit to top N rows based on a sorting column (e.g., `score`) |
| **Grouped Limiting**         | Limit rows within groups (e.g., top N customers per region)   |
| **Performance Optimization** | Reduce the dataset size for faster processing                 |

***

## ✨ Pro Tips

<Tip>
  Use **grouping\_keys** for applying limits within different categories (e.g., top N customers per region).
</Tip>

<Tip>
  If your dataset is large, use **Test Mode** to preview the output with just 5 rows for quick validation.
</Tip>

***

## ⚠️ Important Considerations

<Warning>
  If `limit_across_groups` is set to True but no `grouping_keys` are provided, the node will behave as if `limit_across_groups` is False.
</Warning>

<Warning>
  Sorting uses pandas default null handling (`na_position='last'`): null values are placed at the end for both ascending and descending sorts.
</Warning>
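
To check how pandas itself orders nulls with default settings, a small standalone experiment like the one below can help. It uses only `DataFrame.sort_values` on an invented `score` column and does not call the node.

```python
import pandas as pd

scores = pd.DataFrame({"score": [3.0, None, 1.0]})

# Default na_position is "last" regardless of sort direction.
asc = scores.sort_values("score", ascending=True)
desc = scores.sort_values("score", ascending=False)

print(asc["score"].tolist())   # null appears last
print(desc["score"].tolist())  # null still appears last
```

If nulls land in your top N unexpectedly, filtering or filling the sort column upstream of this node is one way to control the outcome.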

***

## 🛠 Troubleshooting & Gotchas

| Symptom                 | Likely Cause            | Quick Fix                                                       |
| ----------------------- | ----------------------- | --------------------------------------------------------------- |
| Limit ignores groups    | Missing `grouping_keys` | Ensure `grouping_keys` is set if `limit_across_groups` is True. |
| Unexpected row order    | Column sorting issue    | Verify `column_to_sort` and `sorting_order` settings.           |
| No data found           | Invalid S3 URL          | Ensure correct S3 URL is provided for the input DataFrame.      |

***

## 📝 FAQ

<AccordionGroup>
  <Accordion title="Can I apply a limit within groups?">
    Yes, set **`limit_across_groups`** to true and specify **`grouping_keys`** to limit rows within each group.
  </Accordion>

  <Accordion title="What happens if no sorting column is specified?">
    The node will apply the limit to the unsorted data, returning rows in their original order.
  </Accordion>
</AccordionGroup>

***

## 💰 Pricing

The **Limit (Top N) Node** incurs no additional cost for limiting rows.

| Action       | Credit Cost |
| ------------ | ----------- |
| Row Limiting | 0 credits   |

<Note>
  There is no charge for this node unless it's used in conjunction with other nodes that incur charges.
</Note>

***

<p style={{ fontSize: '1rem', fontWeight: 'bold', marginTop: '1.5rem' }}>
  Drop this node into your flow to efficiently limit the number of rows and optimize data processing. 🚀
</p>
