Coding Standards
1. Strings
1.1. Concatenation
ý Do not use
the ‘+’ operator to concatenate many strings. Instead, you should use
StringBuilder for concatenation. However, do use the ‘+’ operator to
concatenate small numbers of strings.
Good:
StringBuilder sb = new StringBuilder();
for
(int i = 0; i < 10; i++)
{
sb.Append(i.ToString());
}
Bad:
string str = string.Empty;
for
(int i = 0; i < 10; i++)
{
str += i.ToString();
}
1.2. Comparison
ý Do not use an overload of
the String.Compare or CompareTo method and test
for a return value of zero to determine whether two strings are equal. They are
used to sort strings, not to check for equality.
þ Do use the String.ToUpperInvariant method instead of
the String.ToLowerInvariant method when you
normalize strings for comparison.
þ Do use overloads that
explicitly specify the string comparison rules for string operations. Typically,
this involves calling a method overload that has a parameter of type StringComparison.
þ Do use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for comparisons as
your safe default for culture-agnostic string matching, and for better
performance.
þ Do use string
operations that are based on StringComparison.CurrentCulture when you display
output to the user.
þ Do use the
non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase values instead of
string operations based on CultureInfo.InvariantCulture when the
comparison is linguistically irrelevant (symbolic, for example). Do not use string operations based on
StringComparison.InvariantCulture in most cases. One of the few exceptions is
when you are persisting linguistically meaningful but culturally agnostic data.
The
Turkish-I Problem
When comparing strings in
.NET, there are a few pitfalls such as the Turkish-I problem.
These new recommendations
and APIs exist to alleviate misguided assumptions about the behavior of default
string APIs. The canonical example of bugs emerging where non-linguistic string
data is interpreted linguistically is the "Turkish-I" problem.
For nearly all Latin alphabets, including
U.S. English, the character i (\u0069) is the lowercase
version of the character I (\u0049). This casing rule quickly
becomes the default for someone programming in such a culture. However, in
Turkish ("tr-TR"), there exists a capital "i with a
dot," character
(\u0130), which is the capital version of i.
Similarly, in Turkish, there is a lowercase "i without a dot,"
or
(\u0131), which capitalizes to I. This
behavior occurs in the Azeri culture ("az") as well.
Therefore, assumptions normally made about
capitalizing i or lowercasing I are not valid
among all cultures. If the default overloads for string comparison routines are
used, they will be subject to variance between cultures. For non-linguistic
data, as in the following example, this can produce undesired results:
Because of the difference of the comparison
of I, results of the comparisons change when the thread culture is changed.
This is the output:
This could cause real
problems if the culture is inadvertently used in security-sensitive settings:
static String
IsFileURI(String path) {
return (String.Compare(path, 0, "FILE:", 0, 5, true) == 0);
}
Something
like IsFileURI("file:") would return true with
a current culture of U.S. English, but false if the culture is
Turkish. Thus, on Turkish systems, one could likely circumvent security
measures to block access to case-insensitive URIs beginning with "FILE:".
Because "file:" is meant to be interpreted as a
non-linguistic, culture-insensitive identifier, the code should instead be
written this way:
static String
IsFileURI(String path) {
return (String.Compare(path, 0, "FILE:", 0, 5,
StringComparison.OrdinalIgnoreCase) ==
0);
}
Because of the Turkish-I
problem, the .NET team originally recommended using InvariantCulture as
the primary cross-culture comparison type. The previous code would then read:
static String
IsFileURI(String path) {
return (String.Compare(path, 0, "FILE:", 0, 5, true,
CultureInfo.InvariantCulture) == 0);
}
Comparisons using InvariantCulture and Ordinal will
work identically when used on ASCII strings; however, InvariantCulture will
make linguistic decisions that might not be appropriate for strings that need
to be interpreted as a set of bytes.
Using the CultureInfo.InvariantCulture.CompareInfo,
certain sets of characters are made equivalent under Compare(). For
example, the following equivalence holds under the invariant culture:
InvariantCulture:
a +
= å
The "latin small letter
a" (\u0061) character a, when next to the "combining ring above"
(\u030a) character
, will be interpreted as the "latin small letter a with
ring above" (\u00e5) character å.
Example:
This prints out:
So, when interpreting file names, cookies, or anything else
where something like the å combination can appear,
ordinal comparisons still offer the most transparent and fitting behavior.
1.3. Equality
2. DateTime
þ You
should use only the Utc date time values.
var
createdOn = DateTime.UtcNow;
No comments:
Post a Comment