Age | Commit message (Collapse) | Author |
|
|
|
|
|
Thank you to @ya-ming for pointing this out!
|
|
|
|
|
|
Copying query or fragment needs to copy the "hasQuery" and
"hasFragment" flags.
Comparing URIs should make use of "hasQuery" and "hasFragment" to
properly compare URIs that might not have query and/or fragment parts.
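
A minimal sketch of the idea, not the actual code: the struct and
function names below are hypothetical stand-ins for the URI's
internal state. Copying carries the flag along with the text, and
comparison checks the flag before it checks the text.

```cpp
#include <string>

// Hypothetical, simplified stand-in for the query/fragment state of a URI.
struct UriParts {
    bool hasQuery = false;
    std::string query;
    bool hasFragment = false;
    std::string fragment;
};

// Copy the query from one URI to another, including the "present" flag.
void CopyQuery(UriParts& to, const UriParts& from) {
    to.hasQuery = from.hasQuery;
    to.query = from.query;
}

// Two URIs agree on query/fragment only if the flags match, and the
// text matches wherever the part is present.
bool SameQueryAndFragment(const UriParts& a, const UriParts& b) {
    return (a.hasQuery == b.hasQuery)
        && (!a.hasQuery || (a.query == b.query))
        && (a.hasFragment == b.hasFragment)
        && (!a.hasFragment || (a.fragment == b.fragment));
}
```

With this, a URI ending in a bare "?" does not compare equal to one
with no query at all, even though both query strings are empty.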
|
|
Can't treat characters using the "char" type directly, because it
may be signed.
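
A small illustration of the pitfall, using a hypothetical helper that
is not part of the library: bytes of 0x80 and above come out negative
wherever "char" is signed, so they are converted to "unsigned char"
before any numeric comparison.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the bytes that lie outside the ASCII range.
std::size_t CountNonAsciiBytes(const std::string& s) {
    std::size_t count = 0;
    for (char c : s) {
        // Where "char" is signed, (c >= 0x80) would never be true,
        // so convert to an unsigned type before comparing.
        const auto byte = static_cast<unsigned char>(c);
        if (byte >= 0x80) {
            ++count;
        }
    }
    return count;
}
```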
|
|
|
|
|
|
|
|
|
|
|
|
A trailing group which is definitely not an IPv4Address
needs to be counted. Detect this as the state being
IN_GROUP_NOT_IPV4 after the end of the string.
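
A heavily simplified sketch of the counting rule. It only counts
colon-separated groups and does not validate digits, "::" placement,
or embedded IPv4 addresses. IN_GROUP_NOT_IPV4 is the state name from
this commit; NO_GROUP and CountGroups are hypothetical.

```cpp
#include <cstddef>
#include <string>

enum class State { NO_GROUP, IN_GROUP_NOT_IPV4 };

// Counts non-empty groups separated by colons.
std::size_t CountGroups(const std::string& address) {
    State state = State::NO_GROUP;
    std::size_t numGroups = 0;
    for (char c : address) {
        if (c == ':') {
            // A colon closes whatever group we were in.
            if (state == State::IN_GROUP_NOT_IPV4) {
                ++numGroups;
            }
            state = State::NO_GROUP;
        } else {
            state = State::IN_GROUP_NOT_IPV4;
        }
    }
    // No colon follows the last group, so the end of the string
    // has to close it instead.
    if (state == State::IN_GROUP_NOT_IPV4) {
        ++numGroups;
    }
    return numGroups;
}
```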
|
|
Assign names to states in the IPv6Address validation routine
|
|
* Multiple colons should not be accepted in state 4.
* After parsing a digit group and encountering a colon,
we need to allow either another colon or the beginning
of either another group or an IPv4 address. Add state 5
to handle this.
|
|
|
|
|
|
Query and fragment may be empty but present in a URI.
Handle this in the same way that port is handled: include
a flag for each of query and fragment, to allow an
empty but present query/fragment.
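
Roughly what the flags buy us when turning a URI back into a string.
This is a sketch with a hypothetical free function, not the class's
real interface.

```cpp
#include <string>

// Hypothetical helper: appends the query and fragment to the part of
// the URI string generated so far. The delimiter is written whenever
// the part is present, even if its text is empty.
std::string AppendQueryAndFragment(
    std::string base,
    bool hasQuery, const std::string& query,
    bool hasFragment, const std::string& fragment
) {
    if (hasQuery) {
        base += '?';
        base += query;
    }
    if (hasFragment) {
        base += '#';
        base += fragment;
    }
    return base;
}
```

An empty string alone cannot tell "http://example.com?" apart from
"http://example.com"; the flag can.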
|
|
* userinfo
* port (hasPort)
* path
* fragment
Also include these elements when generating a string from a URI.
|
|
Add methods to set scheme, host, and query elements.
Add ability to generate URI strings out of scheme,
host, and query elements.
This does not yet support userinfo, port, or fragment elements.
|
|
For example, "[::1", where the square bracket at
the end is missing.
Handle truncated host element by checking the
state we end up in after the entire string is parsed.
Some states represent internal elements of a host name
or address, and so if we're still in those states and
run out of input characters, the input string was
cut off early.
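
A reduced sketch of the end-of-input check. The real state machine
has more states; the state and function names here are hypothetical,
and the contents of the brackets are not validated.

```cpp
#include <string>

enum class HostState { FIRST_CHARACTER, REG_NAME, IP_LITERAL, AFTER_IP_LITERAL };

// Returns false if the host string was cut off inside an IP literal,
// or if anything follows the closing bracket.
bool IsHostComplete(const std::string& host) {
    HostState state = HostState::FIRST_CHARACTER;
    for (char c : host) {
        switch (state) {
            case HostState::FIRST_CHARACTER:
                state = (c == '[') ? HostState::IP_LITERAL : HostState::REG_NAME;
                break;
            case HostState::IP_LITERAL:
                if (c == ']') {
                    state = HostState::AFTER_IP_LITERAL;
                }
                break;
            case HostState::AFTER_IP_LITERAL:
                return false;  // nothing may follow "]" in this sketch
            case HostState::REG_NAME:
                break;
        }
    }
    // IP_LITERAL is an "internal" state: ending there means the
    // closing bracket never arrived.
    return state != HostState::IP_LITERAL;
}
```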
|
|
Don't include the square brackets in the parsed out
host string; they are only there to delimit the address
inside of an overall URI string.
|
|
* Add ValidateIpv6Address.
* Add ValidateIpv4Address (since an IPv6 address is
allowed to contain an IPv4 address for compatibility)
* Add ValidateOctet (used by ValidateIpv4Address).
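
One possible shape for the two IPv4 helpers, shown as a sketch rather
than the code in this commit. The function names come from the list
above; the bodies are assumptions. Leading zeros are rejected,
following the dec-octet rule in RFC 3986.

```cpp
#include <cstddef>
#include <string>

// An octet is one to three decimal digits with a value of 0..255
// and no leading zero (so "0" is fine but "01" is not).
bool ValidateOctet(const std::string& octet) {
    if (octet.empty() || (octet.size() > 3)) {
        return false;
    }
    if ((octet.size() > 1) && (octet[0] == '0')) {
        return false;
    }
    int value = 0;
    for (char c : octet) {
        if ((c < '0') || (c > '9')) {
            return false;
        }
        value = value * 10 + (c - '0');
    }
    return value <= 255;
}

// An IPv4 address is exactly four valid octets separated by dots.
bool ValidateIpv4Address(const std::string& address) {
    std::size_t numOctets = 0;
    std::string octet;
    for (char c : address) {
        if (c == '.') {
            if (!ValidateOctet(octet)) {
                return false;
            }
            ++numOctets;
            octet.clear();
        } else {
            octet += c;
        }
    }
    // The last octet has no dot after it, so check it here.
    if (!ValidateOctet(octet)) {
        return false;
    }
    ++numOctets;
    return numOctets == 4;
}
```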
|
|
* Extract CanNavigatePathUpOneLevel from NormalizePath.
* Add comments to explain what's going on elsewhere
in NormalizePath.
|
|
|
|
|
|
* Extract methods to copy various elements of one URI from another.
* Push NormalizePath implementation into a private method.
* Simplify and consolidate checks for absolute paths.
* Extract methods out of individual steps of ParseFromString.
|
|
Add comments that link parts of the code back
to lines of the pseudocode in the RFC,
to make the code easier to understand.
|
|
The former algorithm was based on the pseudocode
from the RFC, which is hard to follow and is better suited
to a path held in a single string than to a sequence
of segments.
The new algorithm uses two flags:
* isAbsolute - recognize that if the path starts out
as an absolute path, it needs to stay that way.
* atDirectoryLevel - recognize that if we encounter
a "." or "..", then it will be reduced by simply
discarding it or going back/up one step, but then
we will be in a "directory" context, meaning that
should we end the path at this point, there needs
to be an empty-string segment to mark that the
end of the path is reaching into a directory, not
just referring to the directory.
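
A sketch of the two-flag approach described above, written as a free
function for illustration. It assumes the path is a sequence of
segments in which an absolute path starts with an empty segment and
a path ending in "/" ends with an empty segment; the function name
and signature are hypothetical.

```cpp
#include <string>
#include <vector>

std::vector<std::string> NormalizePath(const std::vector<std::string>& path) {
    std::vector<std::string> out;
    // An absolute path must keep its leading empty segment.
    const bool isAbsolute = (!path.empty() && path[0].empty());
    bool atDirectoryLevel = false;
    for (const auto& segment : path) {
        if (segment == ".") {
            // Discard it, but remember we now refer into a directory.
            atDirectoryLevel = true;
        } else if (segment == "..") {
            // Go up one level, unless that would remove the leading
            // empty segment of an absolute path.
            if (!out.empty() && (!isAbsolute || (out.size() > 1))) {
                out.pop_back();
            }
            atDirectoryLevel = true;
        } else {
            // Skip an empty segment if we are already at directory level.
            if (!atDirectoryLevel || !segment.empty()) {
                out.push_back(segment);
            }
            atDirectoryLevel = segment.empty();
        }
    }
    // Ending at directory level needs a trailing empty segment.
    if (atDirectoryLevel && !out.empty() && !out.back().empty()) {
        out.push_back("");
    }
    return out;
}
```

Under these assumptions "/a/b/../c" becomes "/a/c", "/a/." becomes
"/a/", and "/.." stays at "/".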
|
|
Path normalization is hideously broken for now.
|
|
|
|
Such a URI should be considered equivalent to a path of "/"
because in both cases the path is an absolute path.
|
|
For normalization "step 2C", if the output path was
empty, we don't want to pop the end of it off.
|
|
* Code the neat example in section 6.2.2 of the RFC.
* Add equality/inequality operators for Uri.
|
|
|
|
Extract methods that parse the query and fragment.
|
|
* Replaced the more formal "state machine" used to parse URI
elements that may have percent-encoded characters with
a simpler loop that uses a flag and a few conditional logic
paths.
* Extracted the parsing of the above types of elements into
a common method, DecodeElement.
* Kept DecodeQueryOrFragment around, in order to prevent
having to repeat the name of the allowed character set which
is common between query and fragment; however the function
is now just a very thin wrapper.
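
For illustration, a loop of the kind described above might look like
the following. DecodeElement is the method name from this commit, but
the signature, the HexValue helper, and the use of a plain string as
the allowed character set are assumptions.

```cpp
#include <string>

// Hypothetical helper: value of a hex digit, or -1 if not a hex digit.
int HexValue(char c) {
    if ((c >= '0') && (c <= '9')) return c - '0';
    if ((c >= 'A') && (c <= 'F')) return c - 'A' + 10;
    if ((c >= 'a') && (c <= 'f')) return c - 'a' + 10;
    return -1;
}

// Decodes percent-encoded characters in place. Returns false (leaving
// the element untouched) on a bad escape or a disallowed character.
bool DecodeElement(std::string& element, const std::string& allowedCharacters) {
    std::string decoded;
    bool decodingEscape = false;  // the "flag": are we inside "%XX"?
    int digitsSeen = 0;
    int value = 0;
    for (char c : element) {
        if (decodingEscape) {
            const int digit = HexValue(c);
            if (digit < 0) {
                return false;
            }
            value = value * 16 + digit;
            if (++digitsSeen == 2) {
                decoded += static_cast<char>(value);
                decodingEscape = false;
            }
        } else if (c == '%') {
            decodingEscape = true;
            digitsSeen = 0;
            value = 0;
        } else if (allowedCharacters.find(c) != std::string::npos) {
            decoded += c;
        } else {
            return false;
        }
    }
    if (decodingEscape) {
        return false;  // input ended in the middle of an escape
    }
    element = decoded;
    return true;
}
```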
|
|
* Remove IsCharacterInSet function
|
|
|
|
|
|
|
|
Added CharacterSet as a class to represent character sets,
allowing us to build singletons and composite character sets
more concisely.
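
A minimal sketch of what such a class could look like. Only the class
name comes from this commit; the constructors, the Contains method,
and the std::set storage are assumptions.

```cpp
#include <initializer_list>
#include <set>

class CharacterSet {
public:
    CharacterSet() = default;

    // Singleton: a set holding exactly one character.
    CharacterSet(char c) {
        members_.insert(c);
    }

    // Range: every character from first to last, inclusive.
    // Call with parentheses; braces would select the composite
    // constructor below and build a set of just the two characters.
    CharacterSet(char first, char last) {
        for (int c = first; c <= last; ++c) {
            members_.insert(static_cast<char>(c));
        }
    }

    // Composite: the union of other character sets.
    CharacterSet(std::initializer_list<CharacterSet> sets) {
        for (const auto& set : sets) {
            members_.insert(set.members_.begin(), set.members_.end());
        }
    }

    bool Contains(char c) const {
        return members_.count(c) > 0;
    }

private:
    std::set<char> members_;
};
```

A composite set can then be declared in one line, for example
CharacterSet{ CharacterSet('a', 'z'), CharacterSet('0', '9'), '-' }.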
|
|
* Extract IsCharacterInSet to its own module.
* Extract PercentEncodedCharacterDecoder to its own module.
|
|
Remove state 3 hole in host/port parsing state machine
|
|
Extract percent-encoded character decoding, so that
the logic is all in one class that is reused.
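
As a sketch of what such a reusable class might look like: the class
name below is the one used elsewhere in this log, but the method
names and members are assumptions. The caller feeds it the two
characters that follow a "%".

```cpp
class PercentEncodedCharacterDecoder {
public:
    // Feeds in the next hex digit of the escape. Returns false if the
    // character is not a hex digit or two digits were already given.
    bool NextEncodedCharacter(char c) {
        if (digitsSeen_ >= 2) {
            return false;
        }
        int digit;
        if ((c >= '0') && (c <= '9')) {
            digit = c - '0';
        } else if ((c >= 'A') && (c <= 'F')) {
            digit = c - 'A' + 10;
        } else if ((c >= 'a') && (c <= 'f')) {
            digit = c - 'a' + 10;
        } else {
            return false;
        }
        decoded_ = (decoded_ << 4) | digit;
        ++digitsSeen_;
        return true;
    }

    // True once both hex digits have been received.
    bool Done() const {
        return digitsSeen_ == 2;
    }

    // Only meaningful once Done() returns true.
    char GetDecodedCharacter() const {
        return static_cast<char>(decoded_);
    }

private:
    int decoded_ = 0;
    int digitsSeen_ = 0;
};
```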
|
|
|
|
|
|
|
|
Path may also have colon, so make sure we don't scan
into the path element if there is one.
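
One plausible reading, sketched under the assumption that the colon
being searched for is the one between host and port. The helper name
is hypothetical, and IPv6 literals in square brackets are not
handled.

```cpp
#include <cstddef>
#include <string>

// Given the text that follows "//" in a URI, returns the index of the
// colon that separates host from port, or std::string::npos if there
// is none. The search stops where the authority ends, so a colon in
// the path, query, or fragment is never mistaken for the delimiter.
std::size_t FindPortDelimiter(const std::string& afterSlashes) {
    const auto authorityEnd = afterSlashes.find_first_of("/?#");
    const auto colon = afterSlashes.find(':');
    if (colon == std::string::npos) {
        return std::string::npos;
    }
    if ((authorityEnd != std::string::npos) && (colon > authorityEnd)) {
        return std::string::npos;
    }
    return colon;
}
```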
|
|
* Detect bad characters in host names.
* Incorporate splitting host and port into the state
machine that is parsing/decoding the host.
NOTE:
IPv6address is not checked for bad characters yet.
More research is needed to learn exactly what are
the various ways to write an IPv6 address.
|