Some issues seen and some hacky fixes used while working with big data sets

Nov 21, 2010 at 9:53 AM
Edited Nov 21, 2010 at 9:54 AM

Thanks for this project - overall it works great - I've used it in .NET 4 for 4 separate command-line apps :)

When working on some large data sets (around 75,000 items) I found the following problems - mostly to do with corrupt items in my input.

Hope this post helps other people...

Problems seen in the collectionCreator.Create(m_dziPaths, m_dzcPath) call in ParallelDeepZoomCreator.cs

  • System.ArgumentException in ParallelDeepZoomCreator.cs due to problems in my item names (some of my names included the "*" wildcard character - doh!)
  • System.IO.FileNotFoundException in ParallelDeepZoomCreator.cs due to an unknown problem (no real clue why these particular folders didn't exist - it was 8 out of 75,000...)

To fix these I just wrapped the call with: 

            // Validate each path up front: the FileIOPermission constructor
            // throws ArgumentException for paths with invalid characters (e.g. "*").
            List<String> toRemove = new List<string>();
            foreach (var c in m_dziPaths)
            {
                try
                {
                    System.Security.Permissions.FileIOPermission f = new System.Security.Permissions.FileIOPermission(
                            System.Security.Permissions.FileIOPermissionAccess.AllAccess, c);
                }
                catch (ArgumentException)
                {
                    toRemove.Add(c);
                    System.Diagnostics.Trace.WriteLine("INVALID PATH " + c);
                }
            }
            // Remove the invalid paths outside the loop to avoid modifying
            // the collection while enumerating it.
            foreach (var c in toRemove)
            {
                m_dziPaths.Remove(c);
            }

            // Retry the collection build, dropping whichever file the creator
            // reports as missing, until Create succeeds.
            while (true)
            {
                try
                {
                    DZ.CollectionCreator collectionCreator = new DZ.CollectionCreator();
                    collectionCreator.Create(m_dziPaths, m_dzcPath);
                    break;
                }
                catch (System.IO.FileNotFoundException exc)
                {
                    System.Diagnostics.Trace.WriteLine("STUART - SORRY - REMOVING " + exc.FileName);
                    m_dziPaths.Remove(exc.FileName);
                }
            }

A multithreading problem seen in image download

The finalizer in PivotImage.cs occasionally sees IO exceptions in the File.Delete operation - the exception claims that the file is currently open in another process.

Not sure what is causing this - my guess is it's a multithreading issue of some description.

To fix (mask) this I simply added a try/catch to the finalizer:


        ~PivotImage()
        {
            if (m_shouldDelete == false) return;
            if (File.Exists(m_sourcePath) == false) return;

            try
            {
                File.Delete(m_sourcePath);
            }
            catch (IOException)
            {
                System.Diagnostics.Trace.WriteLine("Some clean up needed " + m_sourcePath);
            }
        }
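Masking the exception works, but since finalizers run on the GC's finalizer thread, another option is deterministic cleanup via IDisposable, so the delete happens at a point where you control whether the file is still open. A minimal sketch only - field names mirror the snippet above, the rest of the class is omitted, and it assumes PivotImage can be modified to implement IDisposable:

```csharp
using System;
using System.IO;

// Sketch: a disposable variant of PivotImage so the temp file is
// deleted deterministically instead of on the finalizer thread.
public class PivotImage : IDisposable
{
    private bool m_shouldDelete;
    private string m_sourcePath;

    public void Dispose()
    {
        GC.SuppressFinalize(this); // no need for the finalizer to run again
        if (m_shouldDelete == false) return;
        if (File.Exists(m_sourcePath) == false) return;

        try
        {
            File.Delete(m_sourcePath);
        }
        catch (IOException)
        {
            System.Diagnostics.Trace.WriteLine("Some clean up needed " + m_sourcePath);
        }
    }
}
```

Callers would then wrap each image in a `using` block (or call `Dispose()` explicitly) once the download is finished with it.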

That's it - hope it helps someone.


Jul 11, 2011 at 1:07 PM
Edited Jul 19, 2011 at 1:58 PM

Great article, saved me a lot of time!


In general, I recommend using

PauthorLog.Global.Error();
PauthorLog.Global.Message();
PauthorLog.Global.Warning();

instead of

System.Diagnostics.Trace.WriteLine();

as it provides a consistent way of logging output.
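For example, the Trace call from the validation snippet above could be rewritten like this. This is a sketch under assumptions: that Pauthor's logger lives in the Microsoft.LiveLabs.Pauthor namespace and that its Message/Warning/Error methods accept a format string plus arguments - check your copy of the Pauthor source if the namespace or signatures differ:

```csharp
using Microsoft.LiveLabs.Pauthor; // assumed namespace for PauthorLog

class Example
{
    static void ReportInvalidPath(string path)
    {
        // Instead of System.Diagnostics.Trace.WriteLine("INVALID PATH " + path);
        PauthorLog.Global.Warning("INVALID PATH {0}", path);
    }
}
```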