Strings in D

Strings in the D programming language (as in many other languages) are stored as arrays of characters. It is easy to forget that you are still working with an array. The following code, for example, might result in unexpected behaviour.

char[] a = "foo";
char[] b = "foo";
a[0] = 'x';                    // access violation (Linux)
Stdout(a ~ " " ~ b).newline(); // outputs "xoo xoo" (WinXP)

If you assign something to an array like that, no copy is made: the array just points to the memory address of "foo". In this case "foo" is a string literal. String literals are normally stored in the data segment of the program, and depending on the operating system and its access restrictions you might not be able to modify that memory and get an access violation instead. Where the write does succeed, as on WinXP above, a and b point to the same literal, so changing one changes the other as well. In D 2.0 string literals are immutable, so assigning one to a char[] is a compiler error.
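For readers on a current D2 compiler, the point about literals can be checked at compile time. This is only a minimal sketch, assuming D2 with Phobos (std.stdio) instead of the Tango Stdout used in this post; the variable names are made up for illustration.

```d
import std.stdio;

void main()
{
    // In D2 a string literal has type string, an alias for immutable(char)[].
    auto lit = "foo";
    static assert(is(typeof(lit) == immutable(char)[]));

    // Binding the literal to a mutable char[] is rejected by the compiler.
    static assert(!__traits(compiles, { char[] m = "foo"; }));

    // Writing through an immutable element is rejected as well.
    static assert(!__traits(compiles, { string s = "foo"; s[0] = 'x'; }));

    // Show the type the compiler inferred for the literal.
    writeln(typeof(lit).stringof);
}
```

So under these assumptions the D1 pitfall shown above turns from a runtime surprise into a compile-time error.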

There are basically two ways to fix this code. The simplest is to copy "foo" with the .dup property, which creates a new dynamic array in memory and copies the content of the old array into it. The second way requires more work and uses the slice operation. For this, the slice you copy into must have the same length as the content you want to copy (a larger array works if you slice it down to that length first). It also must not overlap the array you want to copy. Fixed code with both versions:

char[] a = "foo".dup;
char[] b;
b.length = "foo".length;
b[] = "foo";
a[0] = 'x';
Stdout(a ~ " " ~ b).newline(); // outputs "xoo foo" (WinXP & Linux)
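The same two approaches can be sketched for a current D2 compiler. This assumes Phobos (std.stdio) rather than Tango, and the helper names copyByDup and copyBySlice are hypothetical, introduced only for this example.

```d
import std.stdio;

// Copy via .dup: allocates a new array and copies the contents into it.
char[] copyByDup(string src)
{
    return src.dup;
}

// Copy via slice assignment: the destination must already have the same length.
char[] copyBySlice(string src)
{
    auto dst = new char[](src.length);
    dst[] = src[];
    return dst;
}

void main()
{
    char[] a = copyByDup("foo");
    char[] b = copyBySlice("foo");
    a[0] = 'x';          // only touches a's private copy
    assert(a == "xoo");
    assert(b == "foo");
    writeln(a, " ", b);  // prints "xoo foo"
}
```

Both helpers return memory that belongs to the caller, so later writes never reach the literal.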

Another thing to be aware of is using an associative array with char[] (or any other pointer or array type) as the key. Accessing an element first looks at the position that the hash value of your key points to. If something is stored there, the stored key is compared with the access key (using opCmp). If they match, the element is returned. If they don't match and more elements are stored at that position, the comparison continues until something matches. If nothing matches, the lookup throws an exception (or yields null when you use the in operator).

When you store an element in an associative array the key is stored as well, but in the case of arrays or objects/structs only a pointer (and the array length) is saved, not a copy of the data. If the data behind this pointer changes, you will no longer be able to access the element in the associative array. However, there are cases where you can still access it but get wrong results from time to time. This happens if your key variable always lives at the same location in memory, for example a buffer that is reused for every key. Then all stored keys point to the same address, so each of them compares equal to whatever the buffer currently holds, and looking up a key that was stored before will always return some element. But that element is only the first one at the given hash position, and if there is more than one element at that position, the wrong one might be returned.
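To make the lookup steps and the stale-key problem visible, here is a toy hash table. It is a deliberately simplified sketch and not how the D runtime implements associative arrays: TinyMap, Entry, slot, put and find are all hypothetical names, the hash function is made up, keys are compared with == instead of opCmp, and the code assumes a D2 compiler.

```d
// Hypothetical toy table for illustration only.
struct Entry
{
    const(char)[] key; // stores pointer + length only, not a copy of the characters
    int value;
}

struct TinyMap
{
    Entry[][8] buckets;

    // Made-up hash: picks the bucket from the key's current content.
    static size_t slot(const(char)[] key)
    {
        size_t h = 0;
        foreach (c; key)
            h = h * 31 + c;
        return h % 8;
    }

    void put(const(char)[] key, int value)
    {
        auto i = slot(key);
        foreach (ref e; buckets[i])
            if (e.key == key) { e.value = value; return; }
        buckets[i] ~= Entry(key, value);
    }

    // Walk the bucket and compare the stored keys with the access key.
    int* find(const(char)[] key)
    {
        foreach (ref e; buckets[slot(key)])
            if (e.key == key) return &e.value;
        return null;
    }
}

void main()
{
    TinyMap m;
    char[] key = "bar".dup;
    m.put(key, 16);
    assert(m.find("bar") !is null); // found while the key is unchanged

    key[] = "bla";                  // the stored key changes with it
    assert(m.find("bar") is null);  // right bucket, but the stored key now reads "bla"
    assert(m.find("bla") is null);  // "bla" hashes to a different bucket in this table
}
```

After the key buffer is overwritten, the entry still sits in the bucket chosen for "bar" but no longer compares equal to "bar", so it is unreachable under either name.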

A different error happens if the pointer to the key changes, for example if multiple threads access and store elements. The pointer might then point to a valid location in memory whose content is different. If it is a valid location, you will still get the right elements most of the time, but you will also get wrong ones if the key pointer happens to be the same for more than one element at one hash position. If the memory location is invalid, you will get an access violation. This invalid case might only show up if many threads were running while storing something but only a few are running when trying to access an element.

Fixing this problem is easy. Just make sure your keys are never changed after they have been used and stay reachable in memory. The .dup property helps with that as well. A small example:

int[char[]] map;
char[] key = "bar".dup;
map["foo"] = 32;   // good
map[key] = 16;     // not so good
key[] = "bla";     // really bad after using it as key
Stdout.format("{} {}", map["foo"], (("bar" in map) ? map["bar"] : 0)).newline();
// prints "32 0"
map[key.dup] = 42; // good
key[] = "123";
Stdout.format("{}", map["bla"]).newline();
// prints "42"
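On a current D2 compiler the same advice can be expressed with string keys and the .idup property, which makes an immutable copy. This is a hedged sketch, assuming D2 with Phobos; it is not part of the Tango-based example above, and the name buf is only illustrative.

```d
import std.stdio;

void main()
{
    int[string] map;
    char[] buf = "bar".dup; // mutable buffer that gets reused

    map["foo"] = 32;        // literal: immutable, safe as a key
    map[buf.idup] = 16;     // immutable copy: later writes to buf cannot reach the key
    buf[] = "bla";

    assert(map["foo"] == 32);
    assert(("bar" in map) !is null); // still reachable under its original name
    assert(map["bar"] == 16);
    assert(("bla" in map) is null);  // changing buf did not rename the entry

    // map[buf] = 1; is rejected by the compiler: the key must be immutable.
    writeln(map["foo"], " ", map["bar"]); // prints "32 16"
}
```

With string keys the compiler enforces the rule from this post: a key that has been stored can no longer be modified through a mutable buffer.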
