Visual odometry for autonomous terrestrial, aerial, and marine robots involves computing 3-D motion and trajectory by tracking features in video sequences. Where optical systems are ineffective due to restricted visibility and high turbidity, high-frequency 2-D forward-scan sonar can provide images with target details and an attractive trade-off among range, resolution, and data rate. Automated processing of sonar video images enables numerous key capabilities, including target tracking, 3-D estimation of camera and object motions, and scene analysis and classification in terms of target types and their structures. We investigate the computation of 3-D sonar motion by tracking the 2-D images of scene features. The cases of pure rotation, pure translation, and general motion are analyzed separately to gain deeper insight into the inherent characteristics of the underlying ambiguities and complexities. We assess performance through experiments with both synthetic and real data.