Copying Strings Safely in C
This guide shows how strcpy can overflow a stack buffer in C and corrupt an adjacent variable, and how to copy strings safely instead.
The Problem
C provides no inherent protection against manipulating memory beyond the boundaries allocated for a specific variable. The example below demonstrates how data stored in an adjacent variable can be corrupted using the strcpy function, which can inadvertently overwrite a string’s null terminator and write directly into the memory space of another variable:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char s1[] = "Daniel";
    const char s2[] = "Francis";

    // strlen and sizeof yield size_t, which is printed with %zu.
    printf(
        "s1 = \"%s\" (strlen: %zu, sizeof: %zu)\n"
        "s2 = \"%s\" (strlen: %zu, sizeof: %zu)\n",
        s1, strlen(s1), sizeof s1,
        s2, strlen(s2), sizeof s2
    );

    const char *const src = "Christopher";
    strcpy(s1, src); // Deliberate overflow: src needs 12 bytes, s1 has 7.

    printf(
        "s1 = \"%s\" (strlen: %zu, sizeof: %zu)\n"
        "s2 = \"%s\" (strlen: %zu, sizeof: %zu)\n",
        s1, strlen(s1), sizeof s1,
        s2, strlen(s2), sizeof s2
    );
    return EXIT_SUCCESS;
}

The two local string variables, s1 and s2, are allocated on the stack. When compiled with GCC, they can end up stored contiguously in memory, with s2 directly after s1, though this layout is not guaranteed and may vary across compilers, optimization levels, and architectures. The memory layout is as follows, where 0 denotes the string’s null terminator:
Daniel0Francis0
^s1    ^s2

Copying a longer string into s1 with strcpy overwrites the null terminator of s1 with "o" and the first five bytes of s2 with "pher" followed by a new null terminator. The arrays s1 and s2 themselves do not move: each still begins at its original address. Writing past the end of s1 is undefined behavior, so this outcome is what this particular build produces, not something the language guarantees. After the strcpy call, the memory looks like this:
Christopher0is0
^s1    ^s2

The strlen function returns the number of characters in a string, excluding its null terminator, whereas the sizeof operator yields the total size of the array in bytes (sizeof(char) is 1 by definition). Writing past an array’s boundary does not change its size, but it changes where the string appears to end, as the program’s output shows:
s1 = "Daniel" (strlen: 6, sizeof: 7)
s2 = "Francis" (strlen: 7, sizeof: 8)
s1 = "Christopher" (strlen: 11, sizeof: 7)
s2 = "pher" (strlen: 4, sizeof: 8)

The Solution
Such vulnerabilities can be prevented by verifying the available space in the destination buffer prior to performing the write operation. The buffer size of a destination variable can be determined using the sizeof operator (sizeof <variable>), provided the array’s bounds are visible within the current scope:
char s1[] = "Daniel";
const char s2[] = "Francis";
const char *src = "Joseph";     // Source fits into destination, copy.
// const char *src = "Richard"; // Source is too long, do not copy.

// strlen excludes the null terminator, so src must be strictly shorter.
if (strlen(src) < sizeof s1)
    strcpy(s1, src);
else
    printf("String variable `s1` is too short.\n");

printf(
    "s1 = \"%s\" (strlen: %zu, sizeof: %zu)\n",
    s1, strlen(s1), sizeof s1
);

Alternatively, the bounds-checked _s variants of the standard string functions, such as strcpy_s, take the destination size as a parameter and check it at run time. However, these optional interfaces, introduced in C11 (Annex K), are poorly supported across major platforms (e.g., absent in glibc). To request them where available, define the macro __STDC_WANT_LIB_EXT1__ as 1 before including string.h; an implementation that provides them defines __STDC_LIB_EXT1__.
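Once the destination is passed to a function, the array decays to a pointer and sizeof no longer reports the buffer size, so the size has to travel as a separate argument. The following is a minimal sketch of that idea, not a definitive implementation. The helper name copy_checked and its return convention are invented here for illustration. It calls strcpy_s only if the implementation announces Annex K support through __STDC_LIB_EXT1__, and otherwise falls back to the strlen comparison shown above.

```c
#define __STDC_WANT_LIB_EXT1__ 1 /* request Annex K before any standard header */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a library function): copies src into dst only if
 * src fits, including its null terminator.
 * Returns 0 on success and -1 if the destination is too small.
 */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

#ifdef __STDC_LIB_EXT1__
    /*
     * Annex K is available. Note that on a constraint violation strcpy_s
     * first calls the installed constraint handler, whose default behavior
     * is implementation-defined, and may set dst[0] to '\0'.
     */
    return strcpy_s(dst, dst_size, src) == 0 ? 0 : -1;
#else
    /* Fallback: the same check as the strlen/sizeof comparison. */
    size_t len = strlen(src);
    if (len >= dst_size)
        return -1;             /* too long: dst is left untouched */
    memcpy(dst, src, len + 1); /* +1 also copies the null terminator */
    return 0;
#endif
}
```

With the seven-byte s1 from the earlier examples, copy_checked(s1, sizeof s1, "Joseph") returns 0. On the fallback path, copy_checked(s1, sizeof s1, "Richard") returns -1 and leaves s1 unchanged. The caller must still evaluate sizeof where the array type is visible, not on a pointer parameter.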