Note: This reply is mostly helpful if you work with legacy Matlab code, or have colleagues who primarily know Matlab, and you have to work with string data.
In general I agree with you: Matlab's age and origins show through in some warty ways, and string processing is one of them. Whenever I have to process anything that's not simple CSV or Excel, I use Python. (For XML, there's a Perl XPath command-line tool, which has come in pretty handy for simple XML extraction.)
That said, the Statistics Toolbox has the classes dataset, nominal, and ordinal, which take a huge amount of the pain out of working with string data. A dataset lets you mix column types and refer to columns by name, and lets you name rows if you like. I think it's similar to a data frame in R. Nominal and ordinal are efficient representations for string columns: each distinct label is stored once, and the column itself holds integer codes. They are a workaround for Matlab's lack of a runtime string pool, but they are also fast and small.
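To make that concrete, here is a minimal sketch of how these classes fit together. It assumes the Statistics Toolbox is installed; the variable names and sample values (species, petalLen, the row names) are made up for illustration.

```matlab
% Minimal sketch; requires the Statistics Toolbox.
% All data and names below are hypothetical, for illustration only.

% A string column stored as a nominal array instead of a cell array of strings
species  = nominal({'setosa'; 'versicolor'; 'setosa'; 'virginica'});
petalLen = [1.4; 4.7; 1.3; 6.0];

% Mix column types in one dataset, naming both the columns and the rows
ds = dataset(species, petalLen, ...
    'VarNames', {'Species', 'PetalLength'}, ...
    'ObsNames', {'r1'; 'r2'; 'r3'; 'r4'});

% Refer to a column by name
meanLen = mean(ds.PetalLength);

% Select rows by comparing the nominal column against a string
setosaRows = ds(ds.Species == 'setosa', :);

% List the distinct labels held by the nominal column
labels = getlabels(ds.Species);
```

The row filter is where nominal pays off: the comparison against 'setosa' works on the whole column at once, with no strcmp over a cell array.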