Index documents are submitted to a Solr server in XML format. Third-party media content extractors might produce characters that cannot be converted to XML. As a result, the entire documents batch cannot be indexed. An error similar to the following can be found in log records:
Exception: System.ArgumentException
Message: '◻', hexadecimal value 0x1F, is an invalid character.
Source: System.Xml
at System.Xml.XmlEncodedRawTextWriter.WriteElementTextBlock(Char* pSrc, Char* pSrcEnd)
at System.Xml.XmlEncodedRawTextWriter.WriteString(String text)
at System.Xml.XmlWellFormedWriter.WriteString(String text)
at System.Xml.Linq.ElementWriter.WriteElement(XElement e)
at System.Xml.Linq.XElement.WriteTo(XmlWriter writer)
at System.Xml.Linq.XNode.GetXmlString(SaveOptions o)
at SolrNet.Commands.AddCommand`1.ConvertToXml()
at SolrNet.Commands.AddCommand`1.Execute(ISolrConnection connection)
at SolrNet.Impl.LowLevelSolrServer.SendAndParseHeader(ISolrCommand cmd)
at Sitecore.ContentSearch.SolrProvider.SolrBatchUpdateContext.AddRange(IEnumerable`1 group, Int32 groupSize)
at Sitecore.ContentSearch.SolrProvider.SolrBatchUpdateContext.Commit()
at Sitecore.ContentSearch.AbstractSearchIndex.PerformUpdate(IEnumerable`1 indexableInfo, IndexingOptions indexingOptions)
at Sitecore.ContentSearch.AbstractSearchIndex.Update(IEnumerable`1 indexableInfo)