// v1.0.44.14: Added aLength to improve performance in cases where callers already know the length.
// If aLength is at its default of -1, the length will be calculated here.
// Caller must ensure that aBuf isn't NULL.
{
if (!aBuf || !*aBuf) // aBuf is checked for NULL anyway: the check is too cheap to be worth omitting in such a low-level, frequently-called function.
return ""; // Return the constant empty string to the caller (not aBuf itself since that might be volatile).
if (aLength == -1) // Caller wanted us to calculate it. Compare directly to -1 since aLength is unsigned.
aLength = strlen(aBuf);
char *new_buf;
if ( !(new_buf = SimpleHeap::Malloc(aLength + 1)) ) // +1 for the zero terminator.
{
g_script.ScriptError(ERR_OUTOFMEM, aBuf);
return NULL; // Callers may rely on NULL vs. "" being returned in the event of failure.
}
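memcpy(new_buf, aBuf, aLength); // memcpy() typically benchmarks slightly faster than strcpy().
new_buf[aLength] = '\0'; // Terminate explicitly in case the caller-supplied aLength is shorter than the string in aBuf.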
return new_buf;
}
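/*
Illustrative sketch only, not part of the original source. It shows the
"length defaults to (size_t)-1, otherwise trust the caller" pattern used by
the function above, in a stand-alone form. The names DuplicateString and
LENGTH_UNKNOWN are hypothetical, and std::malloc() stands in for the
SimpleHeap allocator purely as an assumption so that the sketch is
self-contained. The sketch writes the terminator explicitly, so a
caller-supplied length shorter than the string still yields a terminated copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical sentinel meaning "length not supplied; measure it here".
static const std::size_t LENGTH_UNKNOWN = static_cast<std::size_t>(-1);

// Hypothetical helper: returns a copy of the first aLength characters of aBuf.
// Returns the constant empty string for NULL or empty input and NULL on
// allocation failure, so callers can tell the two cases apart.
const char *DuplicateString(const char *aBuf, std::size_t aLength = LENGTH_UNKNOWN)
{
    if (!aBuf || !*aBuf)
        return "";
    if (aLength == LENGTH_UNKNOWN) // Unsigned comparison against the sentinel.
        aLength = std::strlen(aBuf);
    char *copy = static_cast<char *>(std::malloc(aLength + 1)); // +1 for the terminator.
    if (!copy)
        return NULL;
    std::memcpy(copy, aBuf, aLength);
    copy[aLength] = '\0';
    return copy;
}
```

For example, DuplicateString("hello world", 5) would yield "hello", while
DuplicateString("hello") measures the length itself.
*/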
char *SimpleHeap::Malloc(size_t aSize)
// Seems okay to return char* for convenience, since that's the type most often used.
// This could be made more memory efficient by searching old blocks for sufficient
// free space to handle <size> prior to creating a new block. But the whole point
// of this class is that it's only called to allocate relatively small objects,
// such as the lines of text in a script file. The length of such lines is typically
// around 80, and only rarely would exceed 1000. Trying to find memory in old blocks
// seems like a bad trade-off compared to the performance impact of traversing a
// potentially large linked list or maintaining and traversing an array of
// "under-utilized" blocks.
{
if (aSize < 1 || aSize > BLOCK_SIZE)
return NULL;
if (!sFirst) // We need at least one block to do anything, so create it.
if ( !(sFirst = CreateBlock()) )
return NULL;
if (aSize > sLast->mSpaceAvailable)
if ( !(sLast->mNextBlock = CreateBlock()) )
return NULL;
sMostRecentlyAllocated = sLast->mFreeMarker; // THIS IS NOW THE NEWLY ALLOCATED BLOCK FOR THE CALLER, which is 32-bit aligned because the previous call to this function (i.e. the logic below) set it up that way.
// v1.0.40.04: Set up the NEXT chunk to be aligned on a 32-bit boundary (the first chunk in each block
// should always be aligned since the block's address came from malloc()). On average, this change
// "wastes" only 1.5 bytes per chunk. In a 200 KB script of typical contents, this change requires less
// than 8 KB of additional memory (as shown by temporarily making BLOCK_SIZE a smaller value such as 8 KB
// for a more accurate measurement). That cost seems well worth the following benefits:
// 1) Solves failure of API functions like GetRawInputDeviceList() when passed a non-aligned address.
// 2) May solve other obscure issues (past and future), which improves sanity due to not chasing bugs
// for hours on end that were caused solely by non-alignment.
// 3) May slightly improve performance since aligned data is easier for the CPU to access and cache.
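/*
Illustrative sketch only, not part of the original source. It works through
the "1.5 bytes per chunk" figure quoted above, assuming chunk sizes are
spread evenly over the four possible remainders modulo 4. The names
RoundUpTo4 and AveragePadding are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: rounds aSize up to the next multiple of 4 bytes (32 bits).
std::size_t RoundUpTo4(std::size_t aSize)
{
    std::size_t remainder = aSize % 4;
    return remainder ? aSize + (4 - remainder) : aSize;
}

// Padding for sizes 1, 2, 3 and 4 is 3, 2, 1 and 0 bytes respectively,
// so the average is (3 + 2 + 1 + 0) / 4 = 1.5 bytes.
double AveragePadding()
{
    std::size_t total = 0;
    for (std::size_t size = 1; size <= 4; ++size)
        total += RoundUpTo4(size) - size;
    return total / 4.0;
}
```
*/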