
How to use strings in Arduino for Beginners
Safe, Robust, Debuggable String class for Arduino

by Matthew Ford 24th November 2020 (original 1st June 2020)
© Forward Computing and Control Pty. Ltd. NSW Australia
All rights reserved.

How to do Safe, Robust and Debuggable String Processing in Arduino
using Printable and Writable non-dynamic SafeStrings.

V2 Warning!! substring( ) in SafeString V2 excludes the endIndex. Check your usage of substring( ) when updating from V1. Also, remove(idx) has been renamed removeFrom(idx).


Update 24th November 2020: Add note about not using const SafeString & method arguments
Update 12th November 2020: SafeString V2.0.0 added wrapping of char* and char[] in a SafeString and added typing shortcuts for the createSafeString... macros. (See Wrapping existing char arrays in a SafeString below). Fixed error in replace( )
Update 21st September 2020: SafeString V1.0.6 added processBackspaces() and retain nextToken delimiters for later inspection. Added SafeString_Menu example
Update 8th August 2020: SafeString V1.0.5 undef ESP8266 nl() macro
Update 25th July 2020: SafeString V1.0.4 available fixed readBuffer arg to const char*
Update 1st July 2020: SafeString V1.0.3 available added fix for Nano 33 BLE
Update 25th June 2020: SafeString V1.0.2 available fixed pgmspace.h for Nano 33 IoT
Update 15th June 2020: SafeString V1.0.1 available
Update 14th June 2020: SafeString now installable via Arduino Library Manager

Introduction

This SafeString library is designed for beginners to be a safe, robust and debuggable replacement for string processing in Arduino. This tutorial shows you how to use SafeStrings for your Arduino string processing. As concrete examples, it includes a simple sketch to check for user's commands, separated by spaces or commas, without blocking the program waiting for input. It also includes an OBD data processor where the existing char* is wrapped in a SafeString for processing. See the User Commands Example and the OBD data processing examples below.

Quick Start
Passing SafeStrings to methods as references and returning SafeString results
Wrapping existing char arrays in a SafeString
An OBD data processor example
Method 'static' SafeStrings and readFrom() and writeTo()
A String Tokenising and Number Conversion Example
User Commands Example
Converting between String and SafeString
Differences between Arduino Strings (WString.cpp) and SafeStrings
Controlling SafeString Error messages and Debugging
What is wrong with Arduino String processing
Conclusion

Also see Arduino For Beginners – Next Steps
How to write Timers and Delays in Arduino
Safe Arduino String Processing for Beginners
Simple Arduino Libraries for Beginners
Simple Multi-tasking in Arduino

Why should you use SafeString for your Arduino string processing?

The previous options for string processing in Arduino were C character arrays (using strcat, strcpy etc.), which are a major source of program crashes, and the Arduino standard String class (contained in the WString.cpp/.h files), which leads to program failure due to memory fragmentation, uses excessive memory, is slower and contains bugs. See What is wrong with Arduino String processing below for the details.

SafeStrings are easy to debug. SafeStrings provide detailed error messages, including the name of SafeString in question, to help you get your program running correctly.
SafeStrings are safe and robust. SafeStrings never cause reboots and are always in a valid usable state, even if your code passes null pointers or '\0' arguments or exceeds the available capacity of the SafeString.
SafeString programs run forever. SafeStrings completely avoid the memory fragmentation that will eventually cause your program to fail, and they never make extra copies when passed as arguments.
SafeStrings are faster. SafeStrings do not create multiple copies or short lived objects nor do they do unnecessary copying of the data.
SafeStrings can wrap existing c-strings. Wrapping existing char* or char[] in a SafeString object allows you to safely perform string manipulations on existing data.

Both Sparkfun and Adafruit advise against using the Arduino String class.
Sparkfun's comment on the String class is “The String method (capital 'S') in Arduino uses a large amount of memory and tends to cause problems in larger scale projects. ‘S’trings should be strongly avoided in libraries.”
Adafruit's comment on the String class is “In most usages, lots of other little String objects are used temporarily as you perform these (String) operations, forcing the new string allocation to a new area of the heap and leaving a big hole where the previous one was (memory fragmentation).”

Unlike Arduino Strings and C character arrays, SafeStrings will NEVER crash your program due to out of memory, null pointers, buffer overrun or invalid objects.
V2.0.0 of SafeStrings adds creating SafeStrings from existing char[] and char*. See Wrapping existing char arrays in a SafeString below.

There is also a separate SafeStringStream library that can be manually installed from its zip file. SafeStringStream wraps a SafeString in a Stream interface for reading and writing using stream methods.

Quick Start

Installation of SafeString library:

SafeString is now available via the Arduino Library Manager, thanks to Va_Tech_EE.
Start the Arduino IDE and open the library manager via the menu Tools->Manage Libraries..
Type SafeString into the Filter your search search bar, mouse over the SafeString entry and click on Install.

You can also install the library manually if you wish, but the Library Manager is easier.
For manual installation, download the SafeString.zip file to your computer and move it to your desktop or some other folder you can easily find. This zip file is a duplicate of what the library manager provides.
Then open the Arduino IDE and use the menu Sketch->Import Library->Add Library to install it. Stop and restart the Arduino IDE.

Under File->Examples you should now see SafeString listed (under Examples from Custom Libraries) and under that a large number of example sketches.
There are numerous examples provided with the SafeString library. Start with the SafeString_ConstructorAndDebugging example and then SafeStringFromCharArray, SafeStringFromCharPtr and SafeStringFromCharPtrWithSize.

Typing shortcuts for createSafeString.. macros

As well as the createSafeString( ) macro, SafeString V2 introduces three (3) more macros: createSafeStringFromCharArray( ), createSafeStringFromCharPtr( ) and createSafeStringFromCharPtrWithSize( ) (see Wrapping existing char arrays in a SafeString below). SafeString V2 also adds the typing shortcuts cSF( ), cSFA( ), cSFP( ) and cSFPS( ) respectively for these four macros. These shortcuts are #defined at the top of SafeString.h.

// define typing shortcuts
#define cSF createSafeString
#define cSFA createSafeStringFromCharArray
#define cSFP createSafeStringFromCharPtr
#define cSFPS createSafeStringFromCharPtrWithSize

You can change them as you wish or add your own defines.

A First Example Sketch

As a first example try the sketch below, SafeString_Example1.ino,

Some of the important statements in this sketch are:-

 createSafeString(msgStr, 5);
creates an empty SafeString large enough to hold 5 char (plus a terminating '\0'). Here a global SafeString, msgStr, is created, but you can also create temporary local SafeStrings inside your functions. In all cases there is never any heap fragmentation. You can also use the typing shortcut cSF( ) which does the same thing e.g.
 cSF(msgStr, 5);

 SafeString::setOutput(Serial);
sets the output Stream that the error messages and debug() output are sent to

 msgStr = F("A0");
 msgStr += " = ";
 msgStr += analogRead(A0);

builds up the string to be output.

 msgStr.debug(F(" After adding analogRead "));
will print the title followed by the details and contents of msgStr (but only if setOutput( ) has been called).

Running this sketch, SafeString_Example1.ino, produces the following output

 10 9 8 7 6 5 4 3 2 1
Error: msgStr.concat() needs capacity of 8 for the first 3 chars of the input.
        Input arg was '598'
        msgStr cap:5 len:5 'A0 = '
 After adding analogRead  msgStr cap:5 len:5 'A0 = '
A0 = 

The error message,

Error: msgStr.concat() needs capacity of 8 for the first 3 chars of the input.
        Input arg was '598'
        msgStr cap:5 len:5 'A0 = '
 After adding analogRead  msgStr cap:5 len:5 'A0 = '

indicates that the msgStr is not large enough to hold the whole message. It needs to be increased to at least a capacity of 8. Note that the msgStr is still valid. If an operation cannot be performed the SafeString is just left unchanged.
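The "left unchanged if it does not fit" behaviour described above can be illustrated in plain C++, without the SafeString library. This is only a minimal sketch of the idea, not the library's implementation. The BoundedText struct and its append() method are hypothetical names invented for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity text buffer (NOT the SafeString implementation).
struct BoundedText {
    static constexpr std::size_t CAPACITY = 5; // same capacity as msgStr in the example
    char buf[CAPACITY + 1];                    // +1 for the terminating '\0'
    std::size_t len;

    BoundedText() : len(0) { buf[0] = '\0'; }

    // Append text only if ALL of it fits, otherwise leave the buffer unchanged.
    // Returns true when the text was appended.
    bool append(const char* text) {
        if (text == nullptr) return false;        // null pointer: nothing is changed
        std::size_t extra = std::strlen(text);
        if (len + extra > CAPACITY) return false; // would overflow: reject the whole append
        std::memcpy(buf + len, text, extra + 1);  // copies the terminating '\0' as well
        len += extra;
        return true;
    }
};
```

With a capacity of 5, appending "A0" and then " = " succeeds and fills the buffer. A further append of "598" would need a capacity of 8, so it is rejected and the buffer still holds 'A0 = ', which mirrors the error message shown above.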

Edit the createSafeString(msgStr, 5); statement to increase its size to 10, i.e. SafeString_Example1a.ino
 
createSafeString(msgStr, 10);

Re-running the sketch (SafeString_Example1a.ino) after this change gives the following complete output

 10 9 8 7 6 5 4 3 2 1
 After adding analogRead  msgStr cap:10 len:8 'A0 = 604'
A0 = 604

You can also cascade these statements, like this, by using nesting brackets
 ( (msgStr = F("A0")) += " = " ) += analogRead(A0);

Finally comment out or remove the setOutput statement, i.e. SafeString_Example1b.ino
 
// SafeString::setOutput(Serial);

Re-running the sketch (SafeString_Example1b.ino) after this change gives the following output

 10 9 8 7 6 5 4 3 2 1
A0 = 487

All the debug() output has been suppressed. Without the setOutput( ) statement the error messages are also suppressed, so you should include a setOutput( ); statement while testing.

How to add text to a SafeString

There are a number of ways to add text to a SafeString. As shown above you can use = and += to add text, numbers and other SafeStrings. You can also pass an initial string value to createSafeString( ) , e.g.
 createSafeString(msgStr, 10, "A0");
If the SafeString is not large enough to hold the initial value, the result will just be an empty SafeString. If SafeString::setOutput(Serial); has been called then an error message will also be output.
Note: Only an inline "string" can be used as the initial string value for the createSafeString( ) macro.

You can also use the concat() methods to do the same, e.g. SafeString_Example1c.ino
The clear() method clears out any existing text in the SafeString. The concat() and clear() methods return a reference to the SafeString so you can chain calls as shown in SafeString_Example1c.ino

 Serial.println(msgStr.concat(" = ").concat(analogRead(A0)));

For more format control you can use the familiar print() methods to print to a SafeString. e.g. SafeString_Example1d.ino To add the analog reading in HEX form you use
 
msgStr.print(analogRead(A0),HEX);
All the print/write methods are available for adding to SafeStrings.

The output from SafeString_Example1d.ino is

SafeString also has prefix() methods and the -= operator to add text to the front of the SafeString, e.g. SafeString_Example1e.ino

 msgStr.clear(); // remove the initial value
 msgStr.print(analogRead(A0),HEX);
 msgStr -= " = 0x";
 msgStr.prefix(F("A0"));
 Serial.println(msgStr);

Other SafeString methods()

SafeString has a range of other methods to work on text :- compareTo, equals, == , != , < , <= , > , >= , equalsIgnoreCase, startsWith, endsWith, endsWithCharFrom, startsWithIgnoreCase, setCharAt, charAt, [ ] (access only), indexOf, lastIndexOf, indexOfCharFrom, substring, remove, removeBefore, removeFrom, removeLast, keepLast, replace, toLowerCase, toUpperCase, trim, toInt, toLong, binToLong, octToLong, hexToLong, toFloat, toDouble, read, readUntil, readFrom, writeTo, stoken, nextToken, processBackspaces.

The two 'unsafe' methods in V1 of this library:- readBuffer and writeBuffer, have been removed in V2 and replaced with the safe readFrom() and writeTo() methods.

Passing SafeStrings to methods as references and returning SafeString results

The Arduino compiler will issue an error or warning if you try and write an incorrect method using SafeString arguments or returns.
Note: You can also work on familiar char* to pass and return arguments and just wrap them in SafeStrings within the method for processing. See Wrapping existing char arrays in a SafeString below and the SafeString_ReadFrom_WriteTo.ino example sketch.

Enabling Arduino Compiler Warnings

To see these errors and warnings easily, make the following settings in the Arduino preferences dialog.

Open Arduino File->Preferences and untick Show verbose output during compilation. Just below that, turn on Compiler Warnings, that is, set Compiler Warnings to something other than None, for example Default. These settings will make it easy to see the warnings/errors referred to below.

How to pass a SafeString as an argument to a method

When you define a method that has a SafeString argument you declare it as a reference, SafeString &str, e.g.
 void test(SafeString& strIn) {
 . . .
 }

If you forget to use the & when defining the function, you will get an error message like the following

 . . . /SafeString.h: In function 'void setup()':
 . . . /SafeString.h:109:5: error: 'SafeString::SafeString(const SafeString&)' is private
     SafeString(const SafeString& other ); // You must declare SafeStrings function arguments as a reference, SafeString&,  e.g. void test(SafeString& strIn)
     ^
TestCode:30:11: error: within this context
   test(str);
           ^

The first line of the errors tells you where the function with the error was called from, e.g. test( ) was called in setup()
The third line tells you what the error was
   // You must declare SafeStrings function arguments as a reference, SafeString&, e.g. void test(SafeString& strIn)
and a little further down you will see which function call caused the error

  TestCode:30:11: error: within this context
   test(str);

That tells you that it is the test( ) method that is missing the SafeString & in its definition. Go to where you wrote the test( ) method and add a & to the end of SafeString. That is, change test(SafeString .. ) to test(SafeString& … )

Declaring the method SafeString& argument as a reference means that any changes you make to it in the method will immediately update the SafeString that you passed to the method when you called it. There is only ever one copy of each SafeString and it is that one copy which gets all the changes applied to it. You could also declare the argument as a SafeString pointer, e.g. SafeString *str, but it is easier to code the method using a SafeString & reference.

Cannot write methods with const SafeString & argument

With SafeString V2 you cannot write a method with a const SafeString & as an argument. You have to use a non-const SafeString & argument as shown above. This is because SafeStrings that wrap existing char arrays (see Wrapping existing char arrays, below) are 'cleaned up' in each SafeString function. This 'clean up' protects the SafeString from the results of invalid c-string operations on the wrapped char array (see Mixing SafeString and c-string methods, below), but it means the SafeString & cannot be a const. If you define a method with a const SafeString & as an argument, then when you compile you will get an error/warning message like

passing 'const SafeString' as 'this' argument discards qualifiers [-fpermissive]

Compiling for Uno/Mega will issue a warning (with the setting Compiler Warnings: Default), while compiling for an ESP32 will give an error.

How to pass a SafeString result back from a method

Unlike the String class, SafeString does not support returning a SafeString from a method. To return results in a SafeString, you must update one of the SafeString & arguments you passed to the method. For example, to write a method that takes the number of loops and returns a SafeString for printing, you would use SafeStringMethods.ino

void loopCountStr(int num, SafeString& str) {
  str = "loop count = ";
  str += num; // update it
  // str returned via SafeString& arg
}

The caller of this method has to provide the SafeString, as an argument, for the result to be returned in. The SafeString provided as an argument can be a global SafeString OR a local SafeString created in the code calling the loopCountStr method.
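The same "caller provides the result object" pattern can be tried in plain C++ without the library. The sketch below is only an illustration under stated assumptions: the Text struct and the loopCountText() function are hypothetical stand-ins, and copying is blocked here with deleted copy operations rather than the private copy constructor that SafeString.h uses.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for a non-copyable string object (NOT the SafeString class).
struct Text {
    char buf[32];
    Text() { buf[0] = '\0'; }
    Text(const Text&) = delete;            // passing or returning by value will not compile
    Text& operator=(const Text&) = delete;
};

// The caller supplies the object and the function fills it in through the reference.
void loopCountText(int num, Text& out) {
    std::snprintf(out.buf, sizeof(out.buf), "loop count = %d", num);
}
```

The caller creates a Text (globally or locally), passes it to loopCountText( ) and then reads the result from that same object. Changing the signature to Text loopCountText(int num) would fail to compile because the copy is not allowed.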

Arduino Strings let you return a String or String & as the return type from the method. It does this by copying the result and passing that copy back. SafeString only ever has one instance of each SafeString and never creates another copy. If you try to pass a SafeString back as the method return you will get either a compiler error or warning as shown below.

Here are two examples that do not work and the errors and warnings they produce.
SafeStringMethod_Error_1.ino uses SafeString loopCountStr(int num) and the compiler gives the error message
 SafeString(const SafeString& other ); // You must declare SafeStrings function arguments as a reference, SafeString&, e.g. void test(SafeString& strIn)
That is, to return a SafeString you need to pass it into the method as a reference and then update it, as in SafeStringMethods.ino

SafeStringMethod_Error_2.ino uses SafeString& loopCountStr(int num) and the compiler gives the warning message
 warning: reference to local variable 'str' returned
That is, the method loopCountStr is trying to return a reference (a pointer) to the local SafeString that was created on the local method stack. The method stack is completely discarded when the method returns, so the data referenced is no longer valid.
You MUST heed this warning and change the code to remove it, because sometimes a sketch with this warning will appear to work, but it will ultimately fail and cause a reboot.

Wrapping existing char arrays in a SafeString

When working with existing code and libraries you often have existing char[] or char* that contains the string data you want to process or where you want to put the results. The macros createSafeStringFromCharArray( ) or cSFA( ) and createSafeStringFromCharPtr( ) or cSFP( ) and createSafeStringFromCharPtrWithSize( ) or cSFPS( ) let you wrap the data in a SafeString for processing using the existing allocated char[] without copying the data to/from a SafeString. See the example sketches SafeStringFromCharArray.ino, SafeStringFromCharPtr.ino and SafeStringFromCharPtrWithSize.ino that are included with the SafeString V2 library.

Once you have wrapped the existing char[] in a SafeString, any changes you make using SafeString methods are reflected in the underlying wrapped char[].

The difference between createSafeString() / cSF() and the other createSafeStringFrom..() versions is that the createSafeString() macro creates a char[] (either globally or on the local stack) whereas the other createSafeStringFrom..() macros all just wrap an existing char[] and do not use any extra char[] storage. Because the createSafeStringFrom...() macros wrap the existing data, any changes made via the SafeString are reflected in the existing data.

For example, when cSFP( ) is used to wrap the “test” string, the underlying char[] is updated when operations are performed on the SafeString :-

  char str[] = "test"; // existing char[]
  Serial.println(str);  // prints  =>  test
  cSFP(sfStr,str);  // use typing shortcut to create a SafeString, sfStr, that wraps the existing data.
  sfStr.toUpperCase();  // changes the underlying data
  Serial.println(str);  // prints => TEST

On the other hand using cSF( ) and sfStr = str, takes a copy of the initial data and so does not change the underlying char[]

  char str[] = "test"; // existing char[]
  Serial.println(str);  // prints  =>  test
  cSF(sfStr, strlen(str)); // create a SafeString large enough to hold a copy of the existing data
  sfStr = str; // copy str to sfStr
  sfStr.toUpperCase();  // does NOT change the underlying data
  Serial.println(str);  // prints => test

createSafeStringFromCharArray( ) or cSFA( )

The macro createSafeStringFromCharArray( ), or its typing shortcut cSFA( ), wraps an existing char[..]. The char[ ] must have a specified size e.g.

char testData[25] = "initial data";

Then

cSFA(sfData, testData);  // SafeString from a char[25]

gives you safe access to the testData[25] and sfData.debug() => SafeString sfData cap:24 len:12 'initial data'
The sfData capacity is only 24 since one extra char of storage is needed to correctly terminate the c-string.

createSafeStringFromCharPtr( ) or cSFP( ) and createSafeStringFromCharPtrWithSize( ) or cSFPS( )

Data accessed by char* is much more common than a char[ ] with a specified size. For char* you can use either createSafeStringFromCharPtr( ) / cSFP( ) or createSafeStringFromCharPtrWithSize( ) / cSFPS( ) to wrap the data in a SafeString.

The macro createSafeStringFromCharPtr( ), or its typing shortcut cSFP( ), wraps the existing c-string pointed to by the char* and sets the SafeString capacity to its strlen(). This macro is very useful for processing incoming data. The macro checks for a NULL pointer but assumes the c-string pointed to is correctly terminated. The SafeString lets you safely process the data, but, because the SafeString's capacity is set to the strlen of the input data, you cannot extend the length. e.g.

char testData[25] = "initial data";
char *dataPtr = testData; // a pointer to the data

Then

cSFP(sfData, dataPtr);  // SafeString from a char*

gives you safe access to the initial data and sfData.debug() => SafeString sfData cap:12 len:12 'initial data'
The sfData capacity is only 12 since that is the only valid data length SafeString can determine from the char *.

The macro createSafeStringFromCharPtrWithSize( ), or its typing shortcut cSFPS( ), wraps the existing c-string pointed to by the char* and sets the SafeString capacity to the specified size. This macro is very useful for outputting data to a result c-string. The macro checks for a NULL pointer but assumes you are correctly specifying the number of chars the result array can hold. The result array needs to be one bigger than that number to hold the terminating null. e.g.

char testData[25] = "initial data"; // testData can hold 24 characters + terminating '\0'
char *dataPtr = testData; // a pointer to the data

Then

cSFPS(sfData, dataPtr, 24);  // SafeString from a char* that can hold (25-1) = 24 char + terminating '\0'

gives you safe access to the initial data and sfData.debug() => SafeString sfData cap:24 len:12 'initial data'

NOTE: It is very important that the underlying char[] pointed to by the char* is at least one (1) larger than the size specified to cSFPS( ).
If in doubt ALWAYS subtract 1 from the size given OR specify strlen( ).
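The "one less than the array size" rule can be checked at compile time when the array itself (not just a pointer to it) is visible. The following is a minimal plain C++ sketch. The usableChars() helper is a hypothetical name and is not part of the SafeString library.

```cpp
#include <cstddef>

// Hypothetical helper (not part of SafeString): the largest size that is safe to
// specify for a char array of N elements, leaving room for the terminating '\0'.
template <std::size_t N>
constexpr std::size_t usableChars(const char (&)[N]) {
    static_assert(N > 0, "array must have room for the terminating null");
    return N - 1;
}
```

For char testData[25], usableChars(testData) is 24, which is the size used in the cSFPS( ) example above. This only works where the char[ ] declaration is in scope. Given just a char*, the compiler cannot know the array size, which is why the size has to be supplied by hand and why subtracting 1 when in doubt is the safe choice.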

An OBD data processor example

A car OBD data processor will be used as an example application using these SafeString wrappers. The initial attempt at this processor (local copy here) using Arduino Strings and malloc() and strdup() only ran for 30 secs before the ESP32 rebooted due to heap fragmentation. That problem was solved using SafeString V1. The SafeString_OBD.ino sketch illustrates how to use the SafeString V2 wrapping macros to do the processing.

The typical code to read the OBD data (using ELMduino library) is

    myELM327.sendCommand("AT SH 7E4");       // Set Header BMS
    if (myELM327.queryPID("220101")) {      // BMS PID = hex 22 0101 => dec 34, 257
      char* payload = myELM327.payload;
      size_t payloadLen = myELM327.recBytes; 


but for this example sketch some static rawData will be used as the payload

char rawData[] = "7F22127F22127F22127F22127F221203E0:620101FFF7E71:FF8A00000000832:00240ED71713133:141713000010C14:22C12800008B005:00450B0000434B6:000019A80000187:1200200EE70D018:7B0000000003E8";
…
    char *payload = rawData;
    size_t payloadLen = strlen(rawData);


The data has the format headerBytes then one frameNumberByte:frameDataBytes repeated. A : starts each frame of data and the byte just before the : is the frame number.
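To make the frame format concrete before looking at the SafeString version, here is a minimal plain C++ sketch that splits such a payload using only strchr( ) and memcpy( ). It is an illustration of the format described above, not the code from SafeString_OBD.ino. The function name splitFrames and the constants FRAME_COUNT and FRAME_SIZE are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t FRAME_COUNT = 9;
const std::size_t FRAME_SIZE = 20; // 19 chars + terminating '\0'

// Split "header, then frameNumber:frameData repeated" into frames[frameNumber].
// Returns the number of frames stored. Frames with a bad number or that are too
// long to fit are skipped and left empty.
std::size_t splitFrames(const char* data, char frames[FRAME_COUNT][FRAME_SIZE]) {
    for (std::size_t i = 0; i < FRAME_COUNT; i++) frames[i][0] = '\0';
    std::size_t stored = 0;
    const char* colon = std::strchr(data, ':');   // skip the header, find the first ':'
    while (colon != nullptr) {
        const char* start = colon + 1;              // first char of this frame's data
        const char* next = std::strchr(start, ':'); // ':' of the next frame, if any
        // The frame ends one char before the next ':' (that char is the next frame number).
        const char* end = (next != nullptr) ? next - 1 : start + std::strlen(start);
        int frameIdx = (colon > data) ? colon[-1] - '0' : -1; // char just before the ':'
        if (frameIdx >= 0 && frameIdx < (int)FRAME_COUNT && end >= start
                && (std::size_t)(end - start) < FRAME_SIZE) {
            std::size_t n = (std::size_t)(end - start);
            std::memcpy(frames[frameIdx], start, n);
            frames[frameIdx][n] = '\0';
            stored++;
        }
        colon = next; // step on to the next frame
    }
    return stored;
}
```

Every bounds check here has to be written and verified by hand, which is the kind of bookkeeping the SafeString wrappers are intended to take over.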

The results of parsing the frames will be stored in a data struct which has an array of 9 char[20] arrays. cSFA( ) will be used to wrap each array to store the results.

struct dataFrames_struct {
  char frames[9][20]; // 9 frames each of 20chars
};
typedef struct dataFrames_struct dataFrames; // create a simple name for this type of data
dataFrames results; // this struct will hold the results

The void processPayload(char *OBDdata, size_t datalen, dataFrames & results) method processes the input data, using cSFPS( ) to wrap the OBDdata with the specified length datalen.
The int convertToInt(char* dataFrame, size_t offset, size_t numberBytes) method is a utility method that uses cSFP( ) to wrap the dataFrame pointer, setting the SafeString capacity to its current strlen, so bytes can be extracted and converted to numbers.

The processing statements are :-

    char *payload = rawData;
    size_t payloadLen = strlen(rawData);
    processPayload(payload, payloadLen, results);
    printBatteryVolts(results); // plus other data

With the SafeString output set i.e. SafeString::setOutput(Serial); the sketch results are:-

frameIdx:0 SafeString frame cap:19 len:12 '620101FFF7E7'
frameIdx:1 SafeString frame cap:19 len:14 'FF8A0000000083'
frameIdx:2 SafeString frame cap:19 len:14 '00240ED7171313'
frameIdx:3 SafeString frame cap:19 len:14 '141713000010C1'
frameIdx:4 SafeString frame cap:19 len:14 '22C12800008B00'
frameIdx:5 SafeString frame cap:19 len:14 '00450B0000434B'
frameIdx:6 SafeString frame cap:19 len:14 '000019A8000018'
frameIdx:7 SafeString frame cap:19 len:14 '1200200EE70D01'
frameIdx:8 SafeString frame cap:19 len:14 '7B0000000003E8'
 hex number  hexSubString cap:14 len:4 '0ED7'
Battery Volts:379.9

Commenting out the // SafeString::setOutput(Serial); gives just

Battery Volts:379.9

Looking at the processPayload( ) method

void processPayload(char *OBDdata, size_t datalen, dataFrames & results) {
  cSFPS(data, OBDdata, datalen); // wrap in a SafeString, size of underlying char[] is datalen + 1 for the terminating '\0'
  clearResultFrames(results);
  size_t idx = data.indexOf(':'); // skip over header and find first delimiter
  while (idx < data.length()) {
    int frameIdx = data[idx - 1] - '0'; // the char before :
    if ((frameIdx < 0) || (frameIdx > 8)) { // error in frame number skip this frame, print a message here
      SafeString::Output.print("frameIdx:"); SafeString::Output.print(frameIdx); SafeString::Output.print(" outside range data: "); data.debug();
      idx = data.indexOf(':', idx + 1); // step over : and find next :
      continue;
    }
    cSFA(frame, results.frames[frameIdx]); // wrap a result frame in a SafeString to store this frame's data
    idx++; // step over :
    size_t nextIdx = data.indexOf(':', idx); // find next :
    if (nextIdx == data.length()) {
      data.substring(frame, idx);  // next : not found so take all the remaining chars as this field
    } else {
      data.substring(frame, idx, nextIdx - 1); // substring upto one byte before next :
    }
    SafeString::Output.print("frameIdx:"); SafeString::Output.print(frameIdx); SafeString::Output.print(" "); frame.debug();
    idx = nextIdx; // step onto next frame
  }
}

The input data is wrapped for processing using

cSFPS(data, OBDdata, datalen);

Once the frame number has been found the corresponding char[] in the results struct is wrapped in SafeString using

cSFA(frame, results.frames[frameIdx]); 

The frame SafeString is then filled with the field data via

data.substring(frame, idx, nextIdx - 1);

Looking at the convertToInt() method

int convertToInt(char* dataFrame, size_t offset, size_t numberBytes) {
  // define a local SafeString on the stack for this method
  cSFP(frame, dataFrame);
  cSF(hexSubString, frame.capacity()); // allow for taking entire frame as a substring
  frame.substring(hexSubString, offset, offset + (numberBytes * 2)); // endIdx is exclusive in SafeString V2
  hexSubString.debug(F(" hex number "));
  long num = 0;
  if (!hexSubString.hexToLong(num)) {
    hexSubString.debug(F(" invalid hex number "));
  }
  return num;
}

The input char* dataFrame is wrapped in a SafeString using

  cSFP(frame, dataFrame);

As you can see above the frames are not all the same length. cSFP( ) sets the SafeString capacity to the current strlen of the c-string in the dataFrame, so SafeString will detect if you try to access bytes outside the current data and output an error if SafeString::setOutput(Serial); has been called. The SafeString V2 hexToLong() method is used to convert the hex bytes to a number and will output a message if the bytes are not valid hex.
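For comparison, the same extract-and-convert step can be sketched in plain C++ with explicit range and digit checks. This is only an illustration, not the library's hexToLong() implementation. The function name hexBytesToLong and its bool return convention are assumptions made for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Convert numberBytes hex bytes (2 chars each), starting at offset, into a long.
// Returns false and leaves result unchanged if the range is outside the current
// data or contains a character that is not a hex digit.
bool hexBytesToLong(const char* frame, std::size_t offset, std::size_t numberBytes, long& result) {
    if (frame == nullptr) return false;
    std::size_t digits = numberBytes * 2;
    if (digits == 0 || digits > 6) return false; // at most 3 bytes, so the value fits in a 32-bit long
    std::size_t len = std::strlen(frame);
    if (offset > len || digits > len - offset) return false; // outside the current data
    char hex[7];
    for (std::size_t i = 0; i < digits; i++) {
        char c = frame[offset + i];
        if (!std::isxdigit((unsigned char)c)) return false;  // not a valid hex digit
        hex[i] = c;
    }
    hex[digits] = '\0';
    result = std::strtol(hex, nullptr, 16);
    return true;
}
```

Applied to frame 2 above, '00240ED7171313', with an offset of 4 and 2 bytes, this extracts '0ED7', which is 3799 decimal. That appears consistent with the Battery Volts:379.9 output if the value is in tenths of a volt, although the scaling used by the sketch is not shown here.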

Mixing SafeString and c-string methods

Because the createSafeStringFrom..() macros wrap existing c-string data, it is possible, but not advisable, to intermix calls to SafeString methods and unsafe c-string methods, like strcat( ). However typically you would do all your processing using SafeString methods, either in a method or within a code block { }

convertToInt() and processPayload() above are examples of doing all the processing with SafeStrings within a method. If you just have a little processing to do you can do it in a small block e.g.

  Serial.println(str); // => initial data
  {
    cSFP(sfStr,str); // wrap in a local SafeString for a bit of processing.
    sfStr.replace("data","d"); // can reduce str length just cannot increase it
  }
  Serial.println(str); // => initial d

However if you do intermix SafeString method calls with c-string methods, the wrapped SafeStrings remain valid.

Before executing each method of a wrapped SafeString:-
a) the underlying c-string is re-terminated at the SafeString capacity() and
b) strlen( ) is called to resynchronize the SafeString length to the length of the underlying c-string.

Once a SafeString has been created its capacity cannot be changed. It has static size. So even if the c-string pointed to has extra space and more characters are added later, the SafeString will truncate the string to the SafeString's original capacity to prevent possible memory overruns. SafeString also resynchronizes its length to the current strlen( ) to allow for external changes.

For example

  char testData[50] = "initial data"; // space for more chars here
  char* str = testData;
  Serial.print(F("str => "));Serial.println(str);
  cSFP(sfStr, str); // create from a char* capacity is set to initial strlen
  sfStr.debug(F("Initial cSFP( ) contents :"));
  strcat(testData, "123");   // access the c-string, str, directly
  sfStr.debug(F("After strcat :")); // no change in capacity
  Serial.print(F("str => "));Serial.println(str); // underlying str was reterminated at capacity
  str[7] = 0; // truncate underlying str
  sfStr.debug(F("After str[7] = 0")); // new length picked up
  Serial.print(F("str => "));Serial.println(str); // underlying str was reterminated at capacity

This produces the following output. Note how the strcat of "123" is lost when the sfStr SafeString re-terminates the underlying c-string to the original capacity. But the new string length after str[7] = 0; is picked up.

str => initial data
Initial cSFP( ) contents : sfStr cap:12 len:12 'initial data'
After strcat : sfStr cap:12 len:12 'initial data'
str => initial data
After str[7] = 0 sfStr cap:12 len:7 'initial'
str => initial

So after performing c-string operations on the underlying char *, SafeStrings remain safe, but intermixing c-string methods and SafeString methods is not advised as the results may not be what you expect.
Also see the SafeStringFromCharArray.ino, SafeStringFromCharPtr.ino and SafeStringFromCharPtrWithSize.ino example sketches included with the SafeString library V2.
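The re-terminate and resynchronize steps a) and b) described in this section can be mimicked in a few lines of plain C++. This is a minimal sketch of the idea only and makes no claim about SafeString's internals. The WrappedText struct and the wrap() and resync() functions are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper record (NOT the SafeString internals): a pointer to
// existing storage plus the capacity that was fixed when it was wrapped.
struct WrappedText {
    char* buf;
    std::size_t capacity; // max chars, excluding the terminating '\0'
    std::size_t len;
};

// Wrap an existing, correctly terminated c-string. The capacity is its current strlen.
WrappedText wrap(char* str) {
    WrappedText w;
    w.buf = str;
    w.capacity = std::strlen(str);
    w.len = w.capacity;
    return w;
}

// Run before each operation: a) re-terminate at capacity, b) re-read the length.
void resync(WrappedText& w) {
    w.buf[w.capacity] = '\0';
    w.len = std::strlen(w.buf);
}
```

Wrapping a char[50] that holds "initial data" gives a capacity of 12. After strcat(testData, "123") a call to resync() cuts the text back to "initial data", and after testData[7] = 0 a call to resync() picks up the new length of 7. This matches the debug output shown above.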

Memory Usage versus Processing Time

A final point on the createSafeStringFrom...() macros. As discussed above each time a SafeString method is called on a SafeString created with one of these macros, the current length of the underlying char[] is checked using strlen( ) and the SafeString.length() updated. This picks up any changes made directly on the underlying char[]. The more times methods are called and the larger the underlying c-string is the more processing time this takes. You can trade this processing cost for memory usage by creating a local SafeString, using createSafeString() or cSF() and copying the data to it. e.g. instead of

  void processPayload(char *OBDdata, size_t datalen, dataFrames & results) {
    cSFPS(data, OBDdata, datalen); // wrap in a SafeString

use

  void processPayload(char *OBDdata, size_t datalen, dataFrames & results) {
    cSF(data, datalen); // create a local SafeString large enough to hold the incoming data
    data = OBDdata; // copy the OBDdata to the SafeString,  = does a copy


Then calling methods on the data SafeString created using cSF( ) does not call strlen( ), because only the SafeString has access to the data. This saves processing time at the expense of the extra stack space used for the char[] that the cSF() macro adds. Also, any changes made via the cSF() SafeString methods do not affect the input OBDdata.

Method 'static' SafeStrings and readFrom() and writeTo()

The example sketch SafeString_ReadFrom_WriteTo.ino (included with the SafeString library) illustrates creating method 'static' SafeStrings which keep their value across method calls. That example also covers the readFrom() and writeTo() SafeString methods. SafeString V2 removes readBuffer() and writeBuffer() and adds the safer readFrom() and writeTo() methods.

Method 'static' SafeStrings are created by wrapping 'static' char[]. e.g.

size_t processInput(char* text, size_t txtIdx, char* outBuf, size_t outBufSize ) {
  // these two method static's keep their values between method calls
  static char inputArray[MAX_CMD_LENGTH + 2]; // +1 for delimiter, +1 for terminating '\0' keep this array from call to call
  static char tokenArray[MAX_CMD_LENGTH + 2]; // +1 for delimiter, +1 for terminating '\0' keep this array from call to call
  cSFA(input, inputArray); // create a 'static' SafeString string by wrapping the static array inputArray
  cSFA(token, tokenArray); // create a 'static' SafeString string by wrapping the static array tokenArray


The rest of the example covers reading small parts of a large buffer into the input SafeString, via readFrom(), to parse valid commands. Any valid command found is then written to a very small output buffer, via writeTo(). The output buffer is very slowly printed out. As space becomes available in the output buffer, the rest of the command is written to it. All of this processing is non-blocking. The loop() code continues to run at maximum speed.

A String Tokenising and Number Conversion Example

The SafeString_stoken_Example.ino sketch illustrates extracting numbers from a CSV (comma separated values) input and converting the valid fields to numbers. Also see the SafeString_stoken example included with the SafeString library.
Commenting out the line
// SafeString::setOutput(Serial);
to remove all the SafeString::Output.print( ) and debug( ) output gives the following

 10 9 8 7 6 5 4 3 2 1
 The input is 23.5, 44a ,, , -5. , +.5, 7a, 33,fred5, 6.5.3, a.5,b.3

Fields with numbers are:-
Field 1 : 23.50
Field 5 : -5.00
Field 6 : 0.50
Field 8 : 33.00

Note that after finding and processing one field the rest of the loop() is run, so the rest of your program remains 'alive'. If you were using the Arduino String class there would be two problems: i) there is no tokenize method and ii) fields like 44a would be interpreted as valid numeric fields.
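
To make the "whole field must be a number" rule concrete, here is a minimal host-side sketch in plain C++. It is not the library's stoken() code. The helper names fieldToFloat() and numericFields() are made up, std::string and std::vector are used only to keep the model short, and strtof() is assumed to be an acceptable stand-in for the number check (it also accepts forms such as 1e3 that the library may treat differently).

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true only if the WHOLE field, ignoring surrounding
// white space, is a number. 'value' is left unchanged on failure.
bool fieldToFloat(const std::string& field, float& value) {
  const char* start = field.c_str();
  char* end = nullptr;
  float parsed = std::strtof(start, &end);
  if (end == start) return false;  // nothing converted, e.g. "" or "fred5"
  while (*end != '\0' && std::isspace(static_cast<unsigned char>(*end))) ++end;
  if (*end != '\0') return false;  // trailing junk, e.g. "44a" or "6.5.3"
  value = parsed;
  return true;
}

// Splits 'line' on commas, keeping empty fields so the field numbers stay in
// step, and returns (field number, value) for every valid numeric field.
std::vector<std::pair<int, float>> numericFields(const std::string& line) {
  std::vector<std::pair<int, float>> result;
  std::size_t pos = 0;
  int fieldNo = 1;
  while (true) {
    std::size_t comma = line.find(',', pos);
    std::size_t count = (comma == std::string::npos) ? std::string::npos : comma - pos;
    float v = 0.0f;
    if (fieldToFloat(line.substr(pos, count), v)) result.push_back({fieldNo, v});
    if (comma == std::string::npos) break;
    pos = comma + 1;
    ++fieldNo;
  }
  return result;
}
```

Run on the input line shown above, this model accepts fields 1, 5, 6 and 8 and rejects the rest, including 44a, 7a and 6.5.3.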

User Commands Example

V1.0.6 adds a SafeString_Menu.ino example file. That sketch, in the SafeString examples directory, extends the example below to provide a template for a non-blocking user menu with commands that have numeric arguments. The setEcho() and processBackspaces() methods accommodate using a terminal program.

You often need a way to enter commands to control a sketch, while it is running. SafeString_ReadCmdsTimed.ino provides that functionality. The standard Arduino Stream methods, readString(), readStringUntil(), etc, do not support this because they pause the whole program while waiting for the user input. They are completely unsuitable when the rest of the sketch is controlling something in 'real time', e.g. a stepper motor.

The basic sketch outline is:-

void loop() {
  input.read(Serial);

  if (input.nextToken(token, delimiters)) { // process at most one token per loop    
    if (token == commandStr) {
      // command action here
    } else if (token == anotherCmd) {
        . . .   test for other cmds
       . . .
    }// else  // not a valid cmd ignore
  }
  // rest of your sketch code here is executed without delay every loop()
}

The process is: read some input, check whether there is a complete token (terminated by a delimiter) and, if there is, check it against the known commands; then execute the rest of the loop code and repeat. There are no delays waiting for user input in this process, so the loop() runs at maximum speed.

The SafeString_ReadCmds.ino example sketch implements this process. Although this sketch will accept an unlimited number of characters and commands in one line, the input and token SafeStrings need only be 1 character longer than the longest command (to allow for the delimiter), so the sketch uses very little RAM.

SafeString_ReadCmds.ino works well as long as the Arduino monitor is set to send Newline or Carriage Return or Both NL & CR line endings. However, if the Arduino monitor is set to No line ending, or the input is being sent character by character from a terminal program, then the last command seems to be ignored. This is because nextToken( ) is waiting for a delimiter to terminate the command. nextToken() cannot tell whether the characters received so far are a complete command or just the first part of a longer one. It needs either to see the next character or to decide that no more characters are coming.

To decide that no more characters are coming, a timer is used. A 0.1sec timer is started (or restarted) whenever some input is received. If the timer times out, there has been no input for 0.1secs, so the sketch assumes the input has ended and adds a dummy delimiter so that the last characters received are terminated and a token is returned for processing. SafeString_ReadCmdsTimed.ino includes that timer.

The revised, timed, sketch outline is:-

void loop() {
  if (input.read(Serial)) {
    timeout.start(100); // start/restart the 0.1sec timer every time something is read
  }

  if (input.nextToken(token, delimiters)) { // process at most one token per loop    
    if (token == commandStr) {
      // command action here
    } else if (token == anotherCmd) {
        . . .   test for other cmds
       . . .
    }// else  // not a valid cmd ignore
  }
  if (timeout.justFinished()) {
    input += delimiters[0]; // nothing received for 0.1secs, terminate the last chars so the token will be processed
  }
  // rest of your sketch code here is executed without delay every loop()
}
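
The timing logic of this outline can be tried out on a PC with the following plain C++ model. It is a sketch under stated assumptions, not the code of SafeString_ReadCmdsTimed.ino. The TimedCommandReader struct and its receive() and poll() methods are made-up names, the clock is passed in as a number of milliseconds instead of being read from a timer, and std::string is used only to keep the host-side model short.

```cpp
#include <cstddef>
#include <string>

// Hypothetical host-side model of the timed command reader.
struct TimedCommandReader {
  std::string input;              // chars received but not yet tokenised
  unsigned long lastRxMs = 0;     // when the most recent char arrived
  bool timerRunning = false;
  unsigned long timeoutMs = 100;  // the 0.1sec timeout

  // Call for every char received; this (re)starts the timer.
  void receive(char c, unsigned long nowMs) {
    input += c;
    lastRxMs = nowMs;
    timerRunning = true;
  }

  // Call once per loop. Never waits. Returns true and fills 'token'
  // when a complete, delimiter terminated token is available.
  bool poll(unsigned long nowMs, std::string& token) {
    // unsigned subtraction stays correct if the ms counter wraps around
    if (timerRunning && (nowMs - lastRxMs) >= timeoutMs) {
      input += ',';               // dummy delimiter terminates the last chars
      timerRunning = false;
    }
    const char* delims = ", \r\n";
    std::size_t start = input.find_first_not_of(delims);
    if (start == std::string::npos) {  // only delimiters left, discard them
      input.clear();
      return false;
    }
    std::size_t end = input.find_first_of(delims, start);
    if (end == std::string::npos) return false;  // no delimiter yet, keep going
    token = input.substr(start, end - start);
    input.erase(0, end + 1);
    return true;
  }
};
```

If the four chars of stop arrive at time 0 with no line ending, poll() returns false at 50ms, and returns true with the token stop once 100ms have passed without further input.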

Looking at SafeString_ReadCmdsTimed.ino in more detail, the top of the sketch initializes the start, stop and reset commands as SafeStrings and defines two SafeStrings to handle the input and the tokens.

The loop() code is

The loopCounter counts very quickly because the loop() always runs as fast as it can. So the loopCounter value is only printed every few secs (i.e. at every multiple of 10,000).

Here is some sample output for the command line
stop,reset start

Using SafeString_ReadCmdsTimed.ino you can now enter commands in the Arduino Monitor with the No line ending setting and they will be executed. If you are typing commands, a character at a time, from a terminal connection, then you will want to increase the TIMEOUT_MS setting to say 500 to 1000 (0.5 to 1.0 secs) depending on how fast you type. Otherwise if you are a bit slow typing in the whole command, the sketch will insert a terminator in the middle of your command and it will not be recognized.

Converting between String and SafeString

To convert an Arduino (C++) String to a SafeString, create a SafeString of sufficient size and copy the String to the SafeString.

  String arduinoStr("this is a string");
  Serial.print(" Arduino String : "); Serial.println(arduinoStr);
  createSafeString(sfStr, arduinoStr.length()); // create a SafeString of sufficient size
  sfStr = arduinoStr.c_str(); // copy arduinoStr to the SafeString
  sfStr.debug();


This gives the output, (if SafeString::setOutput(Serial); has been called)

 Arduino String : this is a string
SafeString sfStr cap:16 len:16 'this is a string'


To convert from a SafeString back to an Arduino (C++) String just use

  arduinoStr = sfStr.c_str();

Differences between Arduino Strings (WString.cpp) and SafeStrings

The major difference is that the capacity of a SafeString is fixed at creation and does not change, whereas Arduino Strings can resize dynamically, using more memory than you expected and fragmenting the heap-space. SafeStrings are also more robust and provide detailed error messages when an operation cannot be performed.

SafeStrings are Always Valid

A SafeString is always left unchanged and in a valid state even if the operation cannot be performed. On the other hand, Arduino Strings lose all the existing data if the dynamic resize fails, and some operations result in invalid String states. e.g.
 String str = "abc";
 str += '\0';
or
 str[2] = '\0';
result in an invalid Arduino String.

In SafeString, s_str += '\0'; and s_str.concat('\0'); are caught as errors: the operation is ignored and an error message is generated. str[ .. ] = is not supported at all, because that error cannot be caught. In SafeStrings use setCharAt( ) instead.
 str.setCharAt(2,'a');

SafeStrings provide consistent operator syntax

SafeStrings do not support the + operator, because it creates temporary String objects and is problematic in its application. For example
 String testStr1;
 testStr1 = "a" + 10; // this compiles and runs
 testStr1 = "a" + 10 + "b"; // this does NOT compile

With SafeStrings use the += operator or the concat() method, i.e.
 createSafeString(testStr, 20);
 testStr = "a";
 testStr += 10;
 testStr += "b";

SafeStrings have Non-blocking Read methods

SafeString has two non-blocking read methods, read() and readUntil(), which allow you to look for user input without halting the rest of your program. See the User Commands Example above and the SafeString_readUntil example sketch provided with the library.

SafeStrings have stoken( ) and nextToken() methods

SafeString has two tokenising methods. stoken() is the low level method which can be used to split a string into tokens separated by specified delimiters or to extract tokens that only contain characters from a specified set of valid characters. nextToken() is a high level method that will remove a delimited token from the SafeString and return it for processing. See the User Commands Example above and the SafeString_stoken example sketch provided with the library.

SafeStrings implement the Print interface

As shown in the previous examples, SafeStrings implement the Print interface so you can print( ) to a SafeString to add text to it and use all the familiar formatting features available with print( ).

Differences in string to number conversions

There are noticeable differences between SafeStrings and Arduino Strings for string to number conversions. In Arduino Strings, if the conversion fails the method returns 0 or 0.0, so you cannot tell if the conversion failed or the number was actually 0. Also the Arduino String number conversions do not look at the whole string but just stop and return a result as soon as the next character is invalid. For example, Arduino String toInt() on “5a” returns 5 as the answer, and toFloat() on “5.3.5” returns 5.3.

The SafeStrings conversion methods, on the other hand, return true or false to indicate if a valid number was found and if one is found then the argument variable is updated. For example (also see the SafeStringToNum example sketch provided with the library.)

 createSafeString(str, 7, " 5.3.6 ");
 float f = 0.0;
 bool rtn = str.toFloat(f);
will return false and leave f unchanged.

 createSafeString(str, 7, " 5 ");
 long ans = 0;
 bool rtn = str.toLong(ans);
will return true and set ans to 5.
Arduino String's toInt() method actually converts the string to a long.
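
The "return true or false and only update the argument on success" style of conversion can be sketched in plain C++ as follows. This is not the SafeString implementation. toLong32() is a made-up helper, and int32_t is used as a stand-in for the 32 bit long of boards such as the UNO so that the range check behaves the same when the code is run on a PC.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: converts 'text' to a 32 bit integer.
// Returns false and leaves 'result' unchanged unless the WHOLE string,
// apart from surrounding white space, is a number that fits in 32 bits.
bool toLong32(const char* text, int32_t& result) {
  char* end = nullptr;
  errno = 0;
  long long parsed = std::strtoll(text, &end, 10);
  if (end == text) return false;  // no digits at all
  if (errno == ERANGE || parsed < INT32_MIN || parsed > INT32_MAX) {
    return false;                 // too big or too small for 32 bits
  }
  while (*end != '\0' && std::isspace(static_cast<unsigned char>(*end))) ++end;
  if (*end != '\0') return false; // trailing junk such as the 'a' in "5a"
  result = static_cast<int32_t>(parsed);
  return true;
}
```

With this helper " 5 " converts to 5, while "5a" and the 15 digit "123456789012345" are both rejected and leave the result variable untouched, so the caller can always tell a failed conversion from a genuine 0.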

Differences in returns type/values

SafeString methods return bool (true or false) for logical returns, whereas Arduino's Strings return unsigned char. In almost all cases there is no functional difference.

SafeString returns size_t from the indexOf type methods, and you need to test for < str.length() to determine if the search succeeded. size_t is the type C/C++ uses for array indices and so it is the 'correct' type to return for an index. size_t is an unsigned type and so is always >= 0. Arduino Strings return an int for index searches and use -1 to indicate the search failed. When you move from Arduino Strings to SafeStrings, you will need to change your tests on methods that return an index to test for < str.length() to see if the search succeeded. See the SafeStringIndexOf example sketch provided with the library for code examples.
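
The following plain C++ sketch shows why the test has to change. indexOfChar() is a made-up helper, not a SafeString method, and it assumes that a failed search is reported as the length of the string, which is one way of satisfying the < length() test described above.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: index of the first 'c' in 'text',
// or strlen(text) if 'c' is not found.
std::size_t indexOfChar(const char* text, char c) {
  std::size_t len = std::strlen(text);
  for (std::size_t i = 0; i < len; ++i) {
    if (text[i] == c) return i;
  }
  return len;  // not found: an index past the last character
}

// The correct success test for an unsigned index.
// A test like (idx >= 0) or (idx != -1) is wrong here:
// the first is always true for size_t.
bool found(const char* text, std::size_t idx) {
  return idx < std::strlen(text);
}
```

For "abc", searching for 'b' gives index 1 and found() is true, while searching for 'z' gives 3, which is not less than the length, so found() is false.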

Controlling SafeString Error messages and Debugging

SafeString has extensive error checking built in. To enable the error messages, you need to tell SafeString where to send them using the setOutput( ) method
 SafeString::setOutput(Serial);

SafeString Error messages

Then if your code tries to exceed the capacity of a SafeString, you will get an error message sent to Serial e.g.
 stringOne = "abc";
 stringOne += "123456789";

Error: stringOne.concat() needs capacity of 12 for the first 9 chars of the input.
      Input arg was '123456789'
      stringOne cap:8 len:3 'abc'


The error message tells you which operation failed, why it failed and a summary of the current state of the SafeString object and its current contents. This is enough for you to find and fix the problem, in this case by increasing the size of stringOne to 12 in the createSafeString( ) macro.
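
The "needs capacity of 12" figure is simply the current length (3) plus the length of the text being added (9). The check-before-copy idea behind it can be sketched in plain C++ as below. FixedText is a made-up type for illustration only, not the SafeString class, and it reports the required capacity through an extra argument instead of printing an error message.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed capacity text buffer.
template <std::size_t CAP>
struct FixedText {
  char data[CAP + 1] = {0};  // +1 for the terminating '\0'
  std::size_t len = 0;

  // Appends 'extra' only if ALL of it fits, otherwise changes nothing.
  // 'needed' is set to the capacity the operation requires.
  bool append(const char* extra, std::size_t& needed) {
    std::size_t extraLen = std::strlen(extra);
    needed = len + extraLen;
    if (needed > CAP) return false;                // reject, contents untouched
    std::memcpy(data + len, extra, extraLen + 1);  // copies the '\0' as well
    len = needed;
    return true;
  }
};
```

With a capacity of 8, appending "abc" succeeds, and a following append of "123456789" is refused with needed set to 12 while the contents remain 'abc'.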

You can use the setVerbose( ) method to turn off/on the printing of the current string contents in error messages at various points in your sketch.
 SafeString::setVerbose(false);
 stringOne = "abc";
 stringOne += "123456789";

now outputs a more compact error message.

Error: stringOne.concat() needs capacity of 12 for the first 9 chars of the input. --- stringOne cap:8 len:3

SafeString Debug statements

You often want to debug your program by outputting the value of a SafeString at various points in the program. The debug() method does that for you. Once you have told SafeString where to send error/debugging messages, you can use the debug( ) method to output the current state of a SafeString. e.g.
 stringOne.debug();

SafeString stringOne cap:8 len:3 'abc'

To get a compact output without the string's contents use
 stringOne.debug(false);

SafeString stringOne cap:8 len:3

You can also add a title, either a " ... " or an F( ) string or another SafeString, to the debug output e.g.
 stringOne.debug(F(" After stringOne = \"abc\";") );

 After stringOne = "abc"; stringOne cap:8 len:3 'abc'

Turning off SafeString Error messages and debug output

You can use the statement
 SafeString::turnOutputOff();
to turn off all error messages and debug() output from then on. Use
 SafeString::setOutput(Serial);
to turn it back on again.

Using SafeString::Output for debugging messages

You can print debugging messages using SafeString's Output. e.g.
 SafeString::Output.println(" string debug msg.. ");

If SafeString::setOutput( ) has been called then SafeString::Output.print( ) will print to that Stream. Otherwise the output is discarded. This lets you add string debugging messages that will disappear when you turn off the other SafeString error messages and debugging output.

Removing All Error Messages Code

The SafeString Error messages add a noticeable amount of code to your program. If you are running out of program space, you can quickly remove all the error messages code by commenting out
// #define SSTRING_DEBUG
at the top of the SafeString.h file and recompiling your sketch. This will remove all the error message handling but will keep all the error checks so that your code is still safe and robust.

The debug() method and SafeString::Output.print( ) methods are still available with //#define SSTRING_DEBUG commented out, but the name of the variable is no longer included in the debug() output. i.e.
 stringOne.debug(F(" After stringOne = \"abc\";") );

 After stringOne = "abc"; cap:8 len:3 'abc'

Debug and Error Message Print Delays

Sending output to Serial can delay the rest of your program from running. At 9600 baud it takes about 1ms per char to send the output. Once the outgoing Serial buffer fills up, your program stops and waits for some chars to be sent to free up space in the buffer so it can send the rest of the output. See Simple Multitasking Arduino for the details and for how to add a larger output buffer. You can reduce the amount of SafeString output by using setVerbose(false); which gives compact error messages that omit the contents of the SafeString. debug(..,false); does the same for debug() statements.
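
The 1ms per char figure follows from the serial framing. Assuming the default 8N1 setting, each char is sent as 10 bits (1 start bit, 8 data bits and 1 stop bit), so 9600 baud is 960 chars per second. The small plain C++ function below (serialSendMs() is a made-up name) does that arithmetic for any message length and baud rate.

```cpp
// Milliseconds needed to send 'chars' characters at 'baud',
// assuming 8N1 framing, i.e. 10 bits on the wire per character.
double serialSendMs(unsigned long chars, unsigned long baud) {
  return chars * 10.0 * 1000.0 / baud;
}
```

For example, an 80 char error message takes a little over 83ms to send at 9600 baud, but only about 7ms at 115200 baud.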

What is wrong with Arduino String processing.

Neither C character arrays nor Arduino Strings are recommended for string processing. C character arrays are a major source of programming bugs in C/C++ programs. With C character arrays there are no checks against overfilling them and so writing over the memory being used by another variable. Both Sparkfun and Adafruit advise against using the Arduino String class.
Sparkfun's comment on the String class is “The String method (capital 'S') in Arduino uses a large amount of memory and tends to cause problems in larger scale projects. ‘S’trings should be strongly avoided in libraries.”
Adafruit's comment on the String class is “In most usages, lots of other little String objects are used temporarily as you perform these (String) operations, forcing the new string allocation to a new area of the heap and leaving a big hole where the previous one was (memory fragmentation).” See this issue for a practical example of memory fragmentation.
(local copy here)

Adafruit supplies a very neat image of this problem.

Adafruit suggests using String.reserve(), but this does not prevent heap fragmentation, it only reduces it, and it does nothing about the excess memory usage.

As well as heap fragmentation, the many little String objects created mean extra time spent creating those objects and then copying the values to the next little object. For example string.replace("abcd","efgh"); creates two new String objects, one each for "abcd" and "efgh", before passing the String objects to the replace() method. Creating these extra String objects makes the program run slower due to the String object creation and the copying of the char arrays, "abcd" and "efgh", into the new String objects. SafeStrings avoid all these problems.

Adafruit also has this advice
Use Local Variables – Every function call creates a stack frame that makes the stack grow toward the heap. Each stack frame will contain:

This data is usable within the function, but the space is 100% reclaimed when the function exits!
Adafruit advises

Some of the Errors in Arduino's WString code

The current WString code supplied with Arduino contains a number of bugs which cause programs using it to either fail or give unexpected results. For example

  String a;
  a = "12345";
  Serial.println(a);
  a += a;
  Serial.println(a);

causes an Arduino UNO to continually reboot.


Another example is

  String aStr;
  aStr = "";
  aStr.concat('\0');
  Serial.print("aStr.length():");Serial.println(aStr.length());
  Serial.print("strlen(aStr.c_str()):");Serial.println(strlen(aStr.c_str()));

which outputs

aStr.length():1
strlen(aStr.c_str()):0

That is, the String object's length() is no longer the same as the strlen() of the underlying char buffer. The Arduino String class has a number of these types of errors.

In contrast, using SafeStrings gives the following output for the first example

12345
1234512345

And for the second example gives the error message

Error: aStr.concat() of '\0'

Using SafeStrings, the first example works as expected and the second example gives a detailed error message containing the name of the SafeString involved, aStr, and leaves the original value of aStr unchanged.
Most, but not all, of WString's bugs could be fixed, but that would not fix the basic problems with Arduino Strings: dynamic memory allocation, lots of object creation and data copying.

In number conversion, Arduino String.toInt() can give odd errors. For example

String a_str("123456789012345");
Serial.println(a_str.toInt());


Outputs

-2045911175


In contrast, SafeString toLong(num) returns false, indicating that the string is not a valid long, and the num argument is left unchanged. See the SafeStringToNum example sketch provided with the library.


Conclusion

String processing using C character arrays (using strcat, strcpy etc) is a major source of program crashes, while the Arduino standard String class (contained in the WString.cpp/.h files) leads to program failure due to memory fragmentation, is slower and contains bugs. The SafeString library presented here solves these problems by providing a safe, robust and debuggable means for doing string processing in Arduino.

SafeStrings are easy to debug. SafeStrings provide detailed error messages, including the name of the SafeString in question, to help you get your program running correctly.
SafeStrings are safe and robust. SafeStrings never cause reboots and are always in a valid usable state, even if your code passes null pointers or '\0' arguments or exceeds the available capacity of the SafeString.
SafeString programs run forever. SafeStrings completely avoid the memory fragmentation that will eventually cause your program to fail, and never make extra copies when passed as arguments.
SafeStrings are faster. SafeStrings do not create multiple copies or short lived objects, nor do they do unnecessary copying of the data.
SafeStrings can wrap existing c-strings. Wrapping an existing char* or char[] in a SafeString object allows you to safely perform string manipulations on existing data.

The SafeString library provides numerous examples as well as a simple practical sketch, SafeString_ReadCmdsTimed.ino, which processes user commands while keeping the rest of the sketch running at maximum speed.


©Copyright 1996-2020 Forward Computing and Control Pty. Ltd. ACN 003 669 994