Skip to content

std: Stabilize more of the char module#23126

Merged
bors merged 1 commit into
rust-lang:masterfrom
alexcrichton:char-third-pass
Mar 11, 2015
Merged

std: Stabilize more of the char module#23126
bors merged 1 commit into
rust-lang:masterfrom
alexcrichton:char-third-pass

Conversation

@alexcrichton

Copy link
Copy Markdown
Member

This commit performs another pass over the std::char module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the std-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

  • char::from_digit is now stable
  • CharExt::is_digit is now stable
  • CharExt::to_digit is now stable
  • CharExt::to_{lower,upper}case are now stable after being modified to return
    an iterator over characters. While the implementation today has not changed
    this should allow us to implement the full set of case conversions in unicode
    where some characters can map to multiple when doing an upper or lower case
    mapping.
  • StrExt::to_{lower,upper}case was added as unstable for a convenience of not
    having to worry about characters expanding to more characters when you just
    want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
CharExt::to_{upper,lower}case methods. Code can be updated to use functions
like flat_map or collect to handle the difference.

[breaking-change]

Closes #20333

@rust-highfive

Copy link
Copy Markdown
Contributor

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton

Copy link
Copy Markdown
Member Author

r? @aturon
r? @SimonSapin

@aturon aturon mentioned this pull request Mar 6, 2015
91 tasks
@alexcrichton alexcrichton force-pushed the char-third-pass branch 2 times, most recently from b0862e5 to 6d6cbbd Compare March 6, 2015 19:46
@SimonSapin

Copy link
Copy Markdown
Contributor

CharExt::to_{lower,upper}case are now stable after being modified to return an iterator over characters. While the implementation today has not changed this should allow us to implement the full set of case conversions in unicode where some characters can map to multiple when doing an upper or lower case mapping.

Does "stable" mean we can still change their behavior after 1.0?

@alexcrichton

Copy link
Copy Markdown
Member Author

Does "stable" mean we can still change their behavior after 1.0?

I believe that we reserve the right to update the unicode standard we're using, which can add new case mappings for existing characters. Along those lines, I see actually parsing the extra data and returning many characters as a similar enhancement.

All in all yes, I believe that this is a change we can make after 1.0

Comment thread src/libunicode/char.rs Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one u16

@alexcrichton

Copy link
Copy Markdown
Member Author

Pushed some updates, thanks for the comments @tbu- and @SimonSapin!

@SimonSapin

Copy link
Copy Markdown
Contributor

LGTM.

@aturon

aturon commented Mar 10, 2015

Copy link
Copy Markdown
Contributor

@bors: r+ d33b308

@aturon

aturon commented Mar 10, 2015

Copy link
Copy Markdown
Contributor

Note: this closes #20333

@SimonSapin

Copy link
Copy Markdown
Contributor

Note: this closes #20333

Returning Iterator<Item=char> is a better (more idiomatic) solution than the Option<&'static str> that I suggested in that bug. (Maybe String::to_{upper,lower}case can gain some performance by not re-encoding to UTF-8 (though that remains to be proven) but it can still do so internally without affecting the API of the char methods.)

@aturon

aturon commented Mar 10, 2015

Copy link
Copy Markdown
Contributor

@SimonSapin Oh, I agree, but I still take this to be addressing the underlying issue of that bug.

@SimonSapin

Copy link
Copy Markdown
Contributor

My message was a long way of saying “+1” :)

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

⌛ Testing commit d33b308 with merge e840e74...

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

💔 Test failed - auto-linux-32-opt

@alexcrichton

Copy link
Copy Markdown
Member Author

@bors: r+

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

@bors r=alexcrichton e74a79a

@alexcrichton

Copy link
Copy Markdown
Member Author

@bors: r=aturon

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

@bors r=aturon e74a79a

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

⌛ Testing commit e74a79a with merge fde1502...

@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

💔 Test failed - auto-win-32-opt

@alexcrichton

Copy link
Copy Markdown
Member Author

@bors: r=aturon 3bcd209

This commit performs another pass over the `std::char` module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the `std`-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

* `char::from_digit` is now stable
* `CharExt::is_digit` is now stable
* `CharExt::to_digit` is now stable
* `CharExt::to_{lower,upper}case` are now stable after being modified to return
  an iterator over characters. While the implementation today has not changed
  this should allow us to implement the full set of case conversions in unicode
  where some characters can map to multiple when doing an upper or lower case
  mapping.
* `StrExt::to_{lower,upper}case` was added as unstable for a convenience of not
  having to worry about characters expanding to more characters when you just
  want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
`CharExt::to_{upper,lower}case` methods. Code can be updated to use functions
like `flat_map` or `collect` to handle the difference.

[breaking-change]
@alexcrichton

Copy link
Copy Markdown
Member Author

@bors: r=aturon 0f6a0b5

bors added a commit that referenced this pull request Mar 10, 2015
This commit performs another pass over the `std::char` module for stabilization.
Some minor cleanup is performed such as migrating documentation from libcore to
libunicode (where the `std`-facing trait resides) as well as a slight
reorganiation in libunicode itself. Otherwise, the stability modifications made
are:

* `char::from_digit` is now stable
* `CharExt::is_digit` is now stable
* `CharExt::to_digit` is now stable
* `CharExt::to_{lower,upper}case` are now stable after being modified to return
  an iterator over characters. While the implementation today has not changed
  this should allow us to implement the full set of case conversions in unicode
  where some characters can map to multiple when doing an upper or lower case
  mapping.
* `StrExt::to_{lower,upper}case` was added as unstable for a convenience of not
  having to worry about characters expanding to more characters when you just
  want the whole string to get into upper or lower case.

This is a breaking change due to the change in the signatures of the
`CharExt::to_{upper,lower}case` methods. Code can be updated to use functions
like `flat_map` or `collect` to handle the difference.

[breaking-change]

Closes #20333
@bors

bors commented Mar 10, 2015

Copy link
Copy Markdown
Collaborator

⌛ Testing commit 0f6a0b5 with merge cfea8ec...

@bors

bors commented Mar 11, 2015

Copy link
Copy Markdown
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Char::to_{lower,upper}case should return Option<&'static str> instead of char

6 participants