What's the SGP focuser UI doing?

So I’m satisfied that my ASCOM focuser driver is working as intended and while the autofocus routine feels slow (and may be delayed at startup, etc), digging through my log and comparing the timing of the focus images, the overall behavior appears to be responding at appropriate speeds. However, when sending a step from the user interface, while the GUI may update to show the start of the move, the motor moves and stops and the GUI stays frozen for a minute or so before finally updating to the current position of the motor.

This is pretty difficult to analyze, given it’s hard to tell exactly where in the log a manual move was made. However, looking closely at the autofocus run logs, I’m seeing evidence of what I would consider a big no-no for a control system like this. It would appear that SGP is attempting multi-threaded access to the focuser driver, since I’m seeing log entries from a second command (e.g. Link query) in between log entries from a first command (e.g. position query). Either that, or the ASCOM log is failing to capture a handful of events around this area.

I’ve cleaned up a chunk of my driver log around the start of a focusing sequence. Other than a couple of “impossible” log entries around some of the apparently out-of-sequence (multithreaded) calls, I’ve limited the entries to only those from specified ASCOM functions (those used by SGP). I’ve commented where things are in the process for reference. As you can see from this snippet, while some commands follow a logical order, there are things occurring in one loop that don’t occur the same way in the next, indicating that it’s not following the exact same code sequence. Again this would make sense if there were two threads making calls inside each other’s loops.

22:29:44.628 Temperature Get           24.3

22:29:45.266 Position                  P = 5598

22:29:46.370 Position                  P = 5598

22:29:47.442 Position                  P = 5598

22:29:48.514 Position                  P = 5598

22:29:48.941 TempCompAvailable Get     False

22:29:48.942 TempCompAvailable Get     False

22:29:48.978 IsMoving Get              False

22:29:49.026 Position                  P = 5598

22:29:49.090 Position                  P = 5598

22:29:49.091 MaxStep Get               10000

22:29:49.123 IsMoving Get              False

22:29:49.138 Move                      6758                           // Move to start plus hysteresis offset

22:29:49.986 IsMoving Get              True

22:29:50.114 Direction Get             -1                            //Impossible query to an internal function that's only called by IsMoving or Position queries

22:29:52.949 Link Get                  True

22:29:55.171 Direction Get             -1                        //Impossible query to an internal function that's only called by IsMoving or Position queries

22:30:02.950 Link Get                  True

22:30:04.692 Temperature Get           24.3

22:30:05.234 Position                  P = 6758 - Moving

22:30:05.298 Position                  P = 6758                 // driver detects motion stopped here...

22:30:05.314 IsMoving Get              False

22:30:05.362 Position                  P = 6758

22:30:05.410 Position                  P = 6758

22:30:05.410 MaxStep Get               10000

22:30:11.282 IsMoving Get              False

22:30:11.285 Target Position           T = 6398                 // Out of sequence log entry (part of Move command)

22:30:12.952 Link Get                  True

2:30:16.322 Move                      6398                            // Go to first position (frame 0)  -- post-hysteresis move.

22:30:16.338 IsMoving Get              True

22:30:16.802 Position                  P = 6736 - Moving

22:30:16.850 Position                  P = 6732 - Moving

22:30:16.850 MaxStep Get               10000

22:30:17.010 Position                  P = 6714 - Moving

22:30:17.010 MaxStep Get               10000

22:30:17.170 Position                  P = 6690 - Moving

22:30:17.170 MaxStep Get               10000

22:30:22.953 Link Get                  True

22:30:24.745 Temperature Get           24.3

22:30:27.362 IsMoving Get              False                    // driver detects motion stopped here...

22:30:32.379 MaxStep Get               10000

22:30:32.401 Direction Get             0                        //Impossible query to an internal function that's only called by IsMoving or Position queries

22:30:32.954 Link Get                  True

22:30:42.514 Position                  P = 6398

22:30:42.562 Position                  P = 6398

22:30:42.578 IsMoving Get              False

22:30:42.626 Position                  P = 6398

22:30:42.626 MaxStep Get               10000

22:30:42.955 Link Get                  True

22:30:43.473 Direction Get             0

22:30:44.821 Temperature Get           24.3

22:30:48.514 Direction Get             0                        //Impossible query to an internal function that's only called by IsMoving or Position queries

22:30:52.956 Link Get                  True

22:30:53.601 Position                  P = 6398

22:30:53.617 IsMoving Get              False

22:30:53.665 Position                  P = 6398                  // Assumed start of 8 second exposure (plus download time)

22:30:54.753 Position                  P = 6398

22:30:55.826 Position                  P = 6398

22:30:56.898 Position                  P = 6398

22:30:57.986 Position                  P = 6398

22:30:59.026 Position                  P = 6398

22:31:00.097 Position                  P = 6398

22:31:01.169 Position                  P = 6398

22:31:02.241 Position                  P = 6398

22:31:02.957 Link Get                  True

22:31:03.314 Position                  P = 6398

22:31:04.369 Position                  P = 6398                  // End of 8 second exposure (storage time)

22:31:08.913 Position                  P = 6398

22:31:09.178 Temperature Get           24.3

22:31:09.954 IsMoving Get              False

22:31:09.969 Move                      6198                         // Second move

22:31:10.065 IsMoving Get              True

22:31:11.025 Position                  P = 6294 - Moving

22:31:11.105 IsMoving Get              True

22:31:11.153 Position                  P = 6272 - Moving

22:31:11.201 Position                  P = 6262 - Moving

22:31:11.202 MaxStep Get               10000

22:31:11.361 Position                  P = 6234 - Moving

22:31:11.361 MaxStep Get               10000

22:31:11.521 Position                  P = 6206 - Moving

22:31:11.522 MaxStep Get               10000

22:31:11.697 Position                  P = 6198             // driver detects motion stopped here...

22:31:11.697 MaxStep Get               10000

22:31:11.857 Position                  P = 6198

22:31:11.857 MaxStep Get               10000

22:31:12.081 Position                  P = 6198

22:31:12.129 Position                  P = 6198

22:31:12.145 IsMoving Get              False

22:31:12.193 Position                  P = 6198

22:31:12.959 Link Get                  True

22:31:13.249 Position                  P = 6198                   // Assumed start of 2nd 8 second exposure (plus download time)

22:31:14.434 Position                  P = 6198

22:31:15.505 Position                  P = 6198

22:31:16.577 Position                  P = 6198

22:31:17.649 Position                  P = 6198

22:31:18.737 Position                  P = 6198

22:31:19.809 Position                  P = 6198

22:31:20.881 Position                  P = 6198

22:31:21.953 Position                  P = 6198

22:31:22.960 Link Get                  True

22:31:23.025 Position                  P = 6198                     // End of 2nd 8 second exposure

22:31:27.185 Position                  P = 6198

22:31:28.177 IsMoving Get              False

Thanks for any comments.

Beo

Yes, there are multiple threads accessing devices. Generally 1 to update the UI and another to monitor things when moving.

I would generally recommend putting a mutex around any operations that need to be atomic (such as a query/response if your hardware can’t support overlapping I/O). Also it’s generally a good idea to cache values in your driver if your hardware can’t handle a high amount of traffic. Although SGP is much less aggressive than it used to be in this regard.

You can see a sample mutex in action in this focus driver I wrote years ago around the only piece that communicates with the serial port (line 122):

As well as the “throttle” DoUpdate. If I were to write this again I’d certainly make some changes here…but the concept is still the same.

Thanks,
Jared

Thanks Jared:

Yeah, I was already planning to add a mutex to address the overlapping issue. Still not sure that explains (or will fix) the reason that the UI isn’t updating/responding in a reasonable time frame though. I’d started the post specifically for that question but by the time I’d finished it the log was showing me the multi-threading issue.

Thanks,

Beo

So adding the mutex around the serial communication (but not each function in the interface) seems to have resolved the issue with the GUI not being responsive. However, the focus routine still seems to have a problem at the last position where it regularly sits there at 100% complete in the auto focus dialog, with the “image preview open” message and a scrolling progress bar on the main window and only the cancel button enabled. I end up having to abort and retry, which ends up leaving the focuser at some undesired position.

Thanks,

Beo

Have you ran the driver through Conform?

Thanks,

Yeah and other than it originally complaining because I had the temperature sensor as something optional and threw a “not connected” exception rather than a “not supported” exception if it wasn’t there, everything’s fine. Changing that passes everything (since I don’t have one of my temperature sensors here to test with, but I know that works fine too).

Startup ASCOM Device Conformance Checker - 32bit mode
Startup ASCOM Platform 6.3 6.3.0.2831

ConformanceCheck ASCOM Device Conformance Checker Version 6.2.59.0, Build time: 1/9/2017 8:44:30 AM
ConformanceCheck Running on: ASCOM Platform 6.3 6.3.0.2831

ConformanceCheck Driver ProgID: ASCOM.OROFocus.Focuser

Error handling
Error number for “Not Implemented” is: 80040400
Error number for “Invalid Value 1” is: 80040404
Error number for “Value Not Set 1” is: 80040402
Error number for “Value Not Set 2” is: 80040403
Error messages will not be interpreted to infer state.

19:03:35.516 Driver Access Checks OK
19:03:36.166 AccessChecks OK Successfully created driver using late binding
19:03:37.950 AccessChecks OK Successfully connected using late binding
19:03:37.955 AccessChecks INFO The driver is a COM object
19:03:38.599 AccessChecks INFO Device does not expose IFocuser interface
19:03:45.313 AccessChecks INFO Device exposes IFocuserV2 interface
19:03:45.531 AccessChecks OK Successfully created driver using driver access toolkit
19:03:52.195 AccessChecks OK Successfully connected using driver access toolkit
19:03:52.312 AccessChecks OK Successfully disconnected using driver access toolkit

Conform is using ASCOM.DriverAccess.Focuser to get a Focuser object
19:03:52.438 ConformanceCheck OK Driver instance created successfully
19:03:59.148 ConformanceCheck OK Connected OK

Common Driver Methods
19:03:59.236 InterfaceVersion OK 2
19:03:59.265 Connected OK True
19:03:59.297 Description OK ASCOM Focuser Driver for OROFocus.
19:03:59.329 DriverInfo OK Orion Ranch Observatory focus motor controller. Version: 6.0
19:03:59.360 DriverVersion OK 6.0
19:03:59.393 Name OK ORO Focuser
19:03:59.423 CommandString INFO Conform cannot test the CommandString method
19:03:59.430 CommandBlind INFO Conform cannot test the CommandBlind method
19:03:59.437 CommandBool INFO Conform cannot test the CommandBool method
19:03:59.446 Action INFO Conform cannot test the Action method
19:03:59.458 SupportedActions OK Driver returned an empty action list

Properties
19:03:59.569 Absolute OK True
19:03:59.612 IsMoving OK False
19:03:59.620 MaxStep OK 10000
19:03:59.628 MaxIncrement OK 10000
19:03:59.724 Position OK 5000
19:03:59.744 StepSize OK Optional member threw a PropertyNotImplementedException exception.
19:03:59.755 TempCompAvailable OK False
19:03:59.763 TempComp Read OK False
19:03:59.776 TempComp Write OK Temperature compensation is not available and a PropertyNotImplementedException exception was generated as expected
19:03:59.787 Temperature OK Optional member threw a PropertyNotImplementedException exception.

Methods
19:03:59.852 Halt OK Focuser halted OK
19:03:59.964 Move Moving to position: 6000
19:04:06.315 Move OK Asynchronous move found
19:04:06.410 Move OK Absolute move OK
19:04:06.425 Move INFO Returning to original position: 5000

Conformance test complete

No errors, warnings or issues found: your driver passes ASCOM validation!!

Beo

Do you have logging in your driver? Can you attach a driver log and corresponding SGP log?

Thanks,
Jared

I turned logging on after the last failure and haven’t had it fail again, yet. It looks like the problem may be that it’s not making the secondary hysteresis move on that last step. We’ll know more if I can get it to fail again.

Thanks,

Beo

Are you referring to the backlash compensation? Might be worth disabling that if so just to test.

Thanks,
Jared

Yeah, I’m referring to the backlash setting. Looking at the log from last night, the one thing I’m seeing that may be the culprit is that I’m still getting a rounding error in the scaling in my stepper code where the target and the result are off by one when the reference goes negative. That’s a common problem for truncation, but I’m not sure why it’s happening here since I thought I had that handled properly. I’ll have to look into it further, but not today!

Thanks,

Beo

There is currently some logic in SGP where we validate the position of the motor. Essentially We told you to go to 1000, we'll wait until you get there even if you say you're done. I’ve been contemplating removing this logic. It is likely what is causing your issue if you’re having a rounding error. But if this is removed it probably won’t happen immediately.

Thanks,
Jared

Thanks Jared, that’s what I suspected. It’s something of a catch 22 on the check or not to check. In this case it could get by without the check, but if there was truly a failure that was way off, it’s good to have SGP report it. In my case I can fix it since I don’t like it reporting the wrong value and I know it’s just a math problem, but there are cases (e.g. servos a la the Celestron DEC cogging problem) where the positioner DOES reach the requested target, but then afterwards drifts away so that the reported position doesn’t match the target. In such a case, having an option to ignore the error would be good. Perhaps you can configure a user defined tolerance (-1 disables)?

Thanks,

Beo

BAH! Found the bug, and I was staring at it and not seeing it! My integer rounding code had the if/then backwards! It was doing:

if (target < 0) {  // round positive 
	return ((target * StepsPerRevolution * ScaleMultiplier / ScaleDivisor) + 0.5);
}
else {             // round negative
	return ((target * StepsPerRevolution * ScaleMultiplier / ScaleDivisor) - 0.5);
}

where the return value was an integer. The first section should have been the negative rounding, subtracting 0.5 then truncate vs. add and truncate for positive. The code in the driver that reverses this to return an integer to ASCOM has it right.

I’m still somewhat undecided if I still shouldn’t make a change to this. I designed the stepper drive to have completely scaleable units (multiplier and divider as shown above, although since it ended up as floating point math, I really don’t need both) and I somewhat arbitrarily chose the unit to be degrees on the output shaft of the stepper gear motor (not the focuser shaft). The problem is with 4096 steps per revolution a degree can’t be accurately represented on this stepper; 4096/360 = 11.3778 which rounds to 11 counts (was rounding to 10 with my bug!). Of course coming back the other way that becomes 0.9668. It’s not a big deal since we use absolute targets so the error doesn’t build, but nonetheless, it’s annoying. I could go back to just using raw steps, but it just makes all the numbers about 10x larger. Decisions, decisions!

Beo

Well, after applying the above fix and still having problems, I went through all the trouble of switching everything over to using raw stepper units so there couldn’t be any round-off error. It was only after doing that and seeing that the rate of motion didn’t change that I discovered that for some reason the installer wasn’t actually replacing the old DLL, even though it kept deleting the profile, etc. Thus I never tested any of the previous fixes. After manually deleting the DLL and re-installing, now it works. I’m not sure it’s worth going back to the degree based version at this point, although the numbers I’m dealing with now are excessive.

One thing I did encounter during the process and I’m not sure of the culprit. When SGP detects that it doesn’t reach the target it sits there hung (as far as the focuser thread) waiting for it for some time. Trying to get out of that at one point I disconnected and reconnected from the ASCOM driver. Best I could tell, that resulted in the previous keypresses I’d sent being forwarded and the focuser took off to never-never land. There was nothing I could do in SGP to stop it. I had to kill SGP and load my other test program to regain control and move it back where it belonged. That was a bit of a hairy experience!

Thanks,

Beo