mirror of
https://github.com/nushell/nushell.git
synced 2025-05-05 23:42:56 +00:00
# Description

This PR allows `from xml` to parse XML documents with [document type declarations](https://en.wikipedia.org/wiki/Document_type_declaration) by default. This is especially notable since many HTML documents start with `<!DOCTYPE html>`, and `roxmltree` should be able to parse some simple HTML documents.

The security concerns with DTDs are [XXE attacks](https://en.wikipedia.org/wiki/XML_external_entity_attack), and [exponential entity expansion attacks](https://en.wikipedia.org/wiki/Billion_laughs_attack). `roxmltree` [doesn't support](d2c7801624/src/tokenizer.rs (L535-L547)) external entities (it parses them, but doesn't do anything with them), so it is not vulnerable to XXE attacks. Additionally, `roxmltree` has [some safeguards](d2c7801624/src/parse.rs (L424-L452)) in place to prevent exponential entity expansion, so enabling DTDs by default is relatively safe. The worst case is no worse than running `loop {}`, so I think allowing DTDs by default is best, and DTDs can still be disabled with `--disallow-dtd` if needed.

# User-Facing Changes

* Allows `from xml` to parse XML documents with [document type declarations](https://en.wikipedia.org/wiki/Document_type_declaration) by default, and adds a `--disallow-dtd` flag to disallow parsing documents with DTDs.

This PR also improves the errors in `from xml` by pointing at the issue in the XML source. Example:

```
$ open --raw foo.xml | from xml
Error:   × Failed to parse XML
   ╭─[2:7]
 1 │ <html>
 2 │     <p<>hi</p>
   ·       ▲
   ·       ╰── Unexpected character <, expected a whitespace
 3 │ </html>
   ╰────
```

# Tests + Formatting

N/A

# After Submitting

N/A
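As a rough illustration of the exponential entity expansion concern discussed in the description, the sketch below expands `&name;` references from an in-memory table and stops once a depth or output-size budget is exceeded. This is not `roxmltree`'s implementation: the names `expand`, `laughs`, and `ExpandError`, and the specific limits, are hypothetical and chosen only for this example. It uses the standard library only.

```rust
use std::collections::HashMap;

/// Reasons the (hypothetical) expander refuses to continue.
#[derive(Debug, PartialEq)]
pub enum ExpandError {
    TooDeep,
    TooLarge,
    UnknownEntity(String),
}

/// Expand `&name;` references in `text`, refusing to nest deeper than
/// `max_depth` or to produce more than `max_len` bytes of output.
pub fn expand(
    text: &str,
    entities: &HashMap<String, String>,
    max_depth: usize,
    max_len: usize,
) -> Result<String, ExpandError> {
    let mut out = String::new();
    expand_into(text, entities, max_depth, max_len, &mut out)?;
    Ok(out)
}

fn expand_into(
    text: &str,
    entities: &HashMap<String, String>,
    depth_left: usize,
    max_len: usize,
    out: &mut String,
) -> Result<(), ExpandError> {
    let mut rest = text;
    while let Some(amp) = rest.find('&') {
        // Copy the literal text before the reference.
        push_checked(out, &rest[..amp], max_len)?;
        let after = &rest[amp + 1..];
        match after.find(';') {
            Some(semi) => {
                let name = &after[..semi];
                let value = entities
                    .get(name)
                    .ok_or_else(|| ExpandError::UnknownEntity(name.to_string()))?;
                // The depth budget also stops self-referencing entities.
                if depth_left == 0 {
                    return Err(ExpandError::TooDeep);
                }
                expand_into(value, entities, depth_left - 1, max_len, out)?;
                rest = &after[semi + 1..];
            }
            None => {
                // A stray '&' without a closing ';' is copied through unchanged.
                push_checked(out, &rest[amp..], max_len)?;
                return Ok(());
            }
        }
    }
    push_checked(out, rest, max_len)
}

/// Append `s` only if the total output stays within `max_len` bytes.
fn push_checked(out: &mut String, s: &str, max_len: usize) -> Result<(), ExpandError> {
    if out.len() + s.len() > max_len {
        return Err(ExpandError::TooLarge);
    }
    out.push_str(s);
    Ok(())
}

/// Build a "billion laughs"-style table: `e0` is "ha", and each `e{i}`
/// refers to `e{i-1}` `fanout` times, so the expanded size grows as fanout^levels.
pub fn laughs(levels: usize, fanout: usize) -> HashMap<String, String> {
    let mut table = HashMap::new();
    table.insert("e0".to_string(), "ha".to_string());
    for i in 1..=levels {
        table.insert(format!("e{}", i), format!("&e{};", i - 1).repeat(fanout));
    }
    table
}
```

With `laughs(3, 10)`, expanding `&e3;` succeeds under a generous budget, while a small `max_len` or `max_depth` makes `expand` return an error instead of allocating the full expansion. The point is only that a bounded expander fails fast on such input; the actual safeguards in `roxmltree` may work differently.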
This crate contains the majority of our commands.

We allow ourselves to move some of the commands in `nu-command` to `nu-cmd-*` crates as needed.

## Internal Nushell crate

This crate implements components of Nushell and is not designed to support plugin authors or other users directly.