Skip to content

Non-ASCII character support#45736

Merged
nodejs-github-bot merged 1 commit into
nodejs:mainfrom
mertcanaltin:bug-45706
Feb 18, 2023
Merged

Non-ASCII character support#45736
nodejs-github-bot merged 1 commit into
nodejs:mainfrom
mertcanaltin:bug-45706

Conversation

@mertcanaltin

@mertcanaltin mertcanaltin commented Dec 4, 2022

Copy link
Copy Markdown
Member

Fixes #45706
Fixes #46508

  • Is the character's ASCII code greater than 127? (In this case, the function returns true)
  • Is the character a number? (In this case, the function returns false)
  • Is the character a "#" symbol? (In this case, the function returns false)
  • If no conditions are met, the function returns true.

@nodejs-github-bot nodejs-github-bot added dont-land-on-v14.x needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Dec 4, 2022
@mertcanaltin

mertcanaltin commented Dec 4, 2022

Copy link
Copy Markdown
Member Author

test

 const  isLiteralSymbol = (char) => {
    const code = char.charCodeAt(0);

    if (code > 127) {
      return true;
    }

    if (char >= '0' && char <= 9) {
      return false;
    }

    if (code === 35) {
      return false;
    }

    return true;
  }

output

isLiteralSymbol('أهلا')
true

Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
@anonrig anonrig added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Dec 4, 2022
@mertcanaltin

Copy link
Copy Markdown
Member Author

@anonrig I applied the changes, thank you very much for the review 🚀

@mertcanaltin mertcanaltin requested a review from anonrig December 4, 2022 22:36
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
@anonrig anonrig requested a review from MoLow December 4, 2022 23:03
@anonrig

anonrig commented Dec 4, 2022

Copy link
Copy Markdown
Member

@manekinekko Can you review this?

@MoLow MoLow left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change LGTM, but this seems to break quite a few tests

Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
assert.strictEqual(isLiteralSymbol('#'), false);
assert.strictEqual(isLiteralSymbol('أ'), true);
assert.strictEqual(isLiteralSymbol('ت'), true);
assert.strictEqual(isLiteralSymbol('ث'), true);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you:

  1. group these into 2 groups: literals (true) and non-literals (false)?
  2. add more samples of non-Latin characters? you can cherry-pick from this list here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko thank you so much i will fly here 🚀

@mertcanaltin mertcanaltin Dec 5, 2022

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko would that be healthy

{
  const literals = ['A', 'a', '-', '+', 'أ', 'ت', 'ث', '讲', '演', '講'];
  const nonLiterals = ['0', '#', '\\', '+', '-'];
  
  literals.forEach((literal) => {
  assert.strictEqual(isLiteralSymbol(literal), true);
  });
  
  nonLiterals.forEach((nonLiteral) => {
  assert.strictEqual(isLiteralSymbol(nonLiteral), false);
  });
}

@mertcanaltin mertcanaltin requested review from MoLow, anonrig and manekinekko and removed request for MoLow, anonrig and manekinekko December 5, 2022 14:59
@mertcanaltin

Copy link
Copy Markdown
Member Author

I'm so sorry I triggered you all to review, sorry for the extra notifications :/

Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js
@mertcanaltin

Copy link
Copy Markdown
Member Author

@manekinekko I sent the edits thank you very much

@BridgeAR BridgeAR left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really accept zero width characters as acceptable input? I would skip all of them. We also already have a function to check for these:

const isZeroWidthCodePoint = (code) => {
return code <= 0x1F || // C0 control codes
(code >= 0x7F && code <= 0x9F) || // C1 control codes
(code >= 0x300 && code <= 0x36F) || // Combining Diacritical Marks
(code >= 0x200B && code <= 0x200F) || // Modifying Invisible Characters
// Combining Diacritical Marks for Symbols
(code >= 0x20D0 && code <= 0x20FF) ||
(code >= 0xFE00 && code <= 0xFE0F) || // Variation Selectors
(code >= 0xFE20 && code <= 0xFE2F) || // Combining Half Marks
(code >= 0xE0100 && code <= 0xE01EF); // Variation Selectors
};

@mertcanaltin

mertcanaltin commented Dec 7, 2022

Copy link
Copy Markdown
Member Author

I didn't accept zero-width characters @BridgeAR i will update the code like this

  if (typeof char !== 'string' || util.inspect.isZeroWidthCodePoint(char)) {
      return false;
    }

can I do it like this?

@mertcanaltin mertcanaltin reopened this Dec 7, 2022
@cjihrig

cjihrig commented Dec 12, 2022

Copy link
Copy Markdown
Contributor

@mertcanaltin I think that would be ok.

@MoLow

MoLow commented Feb 8, 2023

Copy link
Copy Markdown
Member

CC @nodejs/test_runner @manekinekko additional reviews will be appreciated

@MoLow

MoLow commented Feb 8, 2023

Copy link
Copy Markdown
Member

@anonrig can you please dismiss your review/approve?

@MoLow MoLow removed the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Feb 8, 2023
@MoLow MoLow added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed author ready PRs that have at least one approval, no pending requests for changes, and a CI started. labels Feb 9, 2023
@MoLow MoLow added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@github-actions github-actions Bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

@MoLow MoLow added the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot merged commit 3c6547f into nodejs:main Feb 18, 2023
@nodejs-github-bot

Copy link
Copy Markdown
Collaborator

Landed in 3c6547f

@mertcanaltin mertcanaltin changed the title [WIP] Non-ASCII character support Non-ASCII character support Feb 18, 2023
@mertcanaltin

Copy link
Copy Markdown
Member Author

🎉

MoLow pushed a commit to MoLow/node that referenced this pull request Feb 18, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MylesBorins pushed a commit that referenced this pull request Feb 18, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@MylesBorins MylesBorins mentioned this pull request Feb 19, 2023
MylesBorins pushed a commit that referenced this pull request Feb 20, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
juanarbol pushed a commit that referenced this pull request Mar 3, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@juanarbol juanarbol mentioned this pull request Mar 3, 2023
juanarbol pushed a commit that referenced this pull request Mar 5, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test runner regression in 19.2.0 Tap parser fails when test name includes non ASCII characters

9 participants